apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

DocValuesProducer#ramBytesUsed throws ConcurrentModificationException [LUCENE-5443] #6506

Closed asfimport closed 10 years ago

asfimport commented 10 years ago

This came up in an Elasticsearch issue: if you call #ramBytesUsed() while doc values are being loaded in a separate thread, you see a ConcurrentModificationException. Here is an example:

Caused by: java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
        at java.util.HashMap$ValueIterator.next(HashMap.java:954)
        at org.apache.lucene.codecs.lucene45.Lucene45DocValuesProducer.ramBytesUsed(Lucene45DocValuesProducer.java:291)
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsReader.ramBytesUsed(PerFieldDocValuesFormat.java:308)
        at org.apache.lucene.index.SegmentDocValues.ramBytesUsed(SegmentDocValues.java:103)
        at org.apache.lucene.index.SegmentReader.ramBytesUsed(SegmentReader.java:555)
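
For readers unfamiliar with this failure mode, here is a minimal sketch of how such an exception can arise. It is not the Lucene code: the class `CmeSketch` and the map `instances` are hypothetical stand-ins, and the modification happens on the same thread so that the failure is deterministic. In the reported case the map was modified by another thread, which trips the same fail-fast check in `HashMap`'s iterator, only non-deterministically.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a producer that lazily caches per-field buffers.
final class CmeSketch {

    /** Returns true if summing the map's values while adding an entry throws. */
    static boolean iterateWhileAdding() {
        Map<Integer, long[]> instances = new HashMap<>();
        instances.put(1, new long[8]);
        instances.put(2, new long[8]);
        try {
            long bytes = 0;
            for (long[] buffer : instances.values()) {
                bytes += (long) buffer.length * Long.BYTES;
                // Simulates a lazy load of another field during the iteration.
                instances.put(3, new long[8]);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The iterator noticed a structural change to the map.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + iterateWhileAdding());
    }
}
```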

Migrated from LUCENE-5443 by Simon Willnauer (@s1monw), resolved Feb 13 2014 Attachments: LUCENE-5443.patch

asfimport commented 10 years ago

Shai Erera (@shaie) (migrated from JIRA)

I see that in Lucene45DVP, all access to addressInstances is guarded, so I think we should guard ramBytesUsed too. Likewise for ordIndexInstances. But when I look at Lucene4DVP, ramBytesUsed is an AtomicLong, and updated whenever a new DV is added ... can't we do the same? Also, looks like the computation of ramBytesUsed is slightly wrong, as it uses Integer.SIZE which is the number of bits in an int, not bytes.
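
A rough sketch of the AtomicLong idea, under stated assumptions: `LazyBufferCache`, `getInstance`, and the `long[]` buffers are hypothetical names chosen for illustration and are not taken from the Lucene sources or the attached patch. The point is only that the running total is updated inside the guarded section where a buffer is added, so that reading it never has to iterate the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical cache; it only illustrates the bookkeeping pattern.
final class LazyBufferCache {
    private final Map<Integer, long[]> instances = new HashMap<>();
    private final AtomicLong ramBytesUsed = new AtomicLong();

    /** Loads (here: allocates) the buffer for a field once, under the lock. */
    synchronized long[] getInstance(int fieldNumber, int length) {
        long[] buffer = instances.get(fieldNumber);
        if (buffer == null) {
            buffer = new long[length];
            instances.put(fieldNumber, buffer);
            // Account for the new buffer at the moment it is added.
            ramBytesUsed.addAndGet((long) length * Long.BYTES);
        }
        return buffer;
    }

    /** Never touches the map, so it cannot race with a concurrent load. */
    long ramBytesUsed() {
        return ramBytesUsed.get();
    }

    public static void main(String[] args) {
        LazyBufferCache cache = new LazyBufferCache();
        cache.getInstance(1, 4);
        cache.getInstance(1, 4); // already cached, not counted twice
        cache.getInstance(2, 2);
        System.out.println(cache.ramBytesUsed());
    }
}
```

The alternative mentioned above, guarding ramBytesUsed with the same lock as the maps, would also avoid the exception, at the cost of making the reader wait for any load in progress.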

asfimport commented 10 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

> Also, looks like the computation of ramBytesUsed is slightly wrong, as it uses Integer.SIZE which is the number of bits in an int, not bytes.

It should use RamUsageEstimator#NUM_BYTES_INT - this constant is 4
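
To make the size of the error concrete, here is a small self-contained sketch. It uses the JDK constant `Integer.BYTES` (available since Java 8) in place of `RamUsageEstimator#NUM_BYTES_INT` so that it runs without Lucene on the classpath; the class and method names are hypothetical.

```java
// Hypothetical helper contrasting the two constants.
final class IntSizeSketch {

    /** Mistaken estimate: Integer.SIZE is 32, the number of bits in an int. */
    static long estimateWithBits(int numInts) {
        return (long) numInts * Integer.SIZE;
    }

    /** Intended estimate: Integer.BYTES is 4, the number of bytes in an int. */
    static long estimateWithBytes(int numInts) {
        return (long) numInts * Integer.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(estimateWithBits(10));  // 320
        System.out.println(estimateWithBytes(10)); // 40
    }
}
```

Using the bit count overstates the memory taken by the int values by a factor of 8.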

asfimport commented 10 years ago

Shai Erera (@shaie) (migrated from JIRA)

Right. I can create a patch a bit later, unless someone beats me to it...

asfimport commented 10 years ago

Shai Erera (@shaie) (migrated from JIRA)

The patch adds an AtomicLong ramBytesUsed and updates it whenever a new buffer is added to the maps. I think it's ready!

asfimport commented 10 years ago

Adrien Grand (@jpountz) (migrated from JIRA)

+1 to commit

asfimport commented 10 years ago

ASF subversion and git services (migrated from JIRA)

Commit 1567954 from @shaie in branch 'dev/trunk' https://svn.apache.org/r1567954

LUCENE-5443: DocValuesProducer.ramBytesUsed throws ConcurrentModificationException

asfimport commented 10 years ago

ASF subversion and git services (migrated from JIRA)

Commit 1567960 from @shaie in branch 'dev/branches/branch_4x' https://svn.apache.org/r1567960

LUCENE-5443: DocValuesProducer.ramBytesUsed throws ConcurrentModificationException

asfimport commented 10 years ago

Shai Erera (@shaie) (migrated from JIRA)

Committed to trunk and 4x.