openshiftio / openshift.io

Red Hat OpenShift.io is an end-to-end development environment for planning, building and deploying modern applications.
https://openshift.io
97 stars 66 forks source link

Caused by: java.io.IOException: No space left on device #1298

Closed tuxdna closed 6 years ago

tuxdna commented 6 years ago

When scheduling scan of maven packages using Jobs API, following error is encountered in the staging cluster:

INFO:f8a_jobs.handlers.base:Fetching pre-built maven index from S3, if available.
[main] INFO com.redhat.maven.index.checker.MavenIndexChecker - Updating Index...
[main] INFO com.redhat.maven.index.checker.MavenIndexChecker - This might take a while on first run, so please be patient!
[main] INFO com.redhat.maven.index.checker.MavenIndexChecker -   Downloading nexus-maven-repository-index.properties
[main] INFO com.redhat.maven.index.checker.MavenIndexChecker -  - Done
[main] INFO com.redhat.maven.index.checker.MavenIndexChecker -   Downloading nexus-maven-repository-index.gz
[main] INFO com.redhat.maven.index.checker.MavenIndexChecker -  - Done
[pid: 1|app: 0|req: 2752/2752] 10.1.2.1 () {30 vars in 380 bytes} [Tue Nov  7 10:10:21 2017] GET /api/v1/readiness => generated 3 bytes in 2 msecs (HTTP/1.1 200) 2 headers in 70 bytes (1 switches on core 0)
2017-11-07 10:10:21,397 - Liveness probe - trying retrieve stored jobs from database using scheduler
WARNING:f8a_jobs.api_v1:Liveness probe - trying retrieve stored jobs from database using scheduler
2017-11-07 10:10:21,397 - Liveness probe - trying to schedule the livenessFlow
WARNING:f8a_jobs.api_v1:Liveness probe - trying to schedule the livenessFlow
2017-11-07 10:10:21,398 - Scheduling Selinon flow 'livenessFlow' with node_args: 'None', job 'None'
DEBUG:f8a_jobs.handlers.base:Scheduling Selinon flow 'livenessFlow' with node_args: 'None', job 'None'
2017-11-07 10:10:21,506 - Liveness probe - finished
WARNING:f8a_jobs.api_v1:Liveness probe - finished
[pid: 1|app: 0|req: 2753/2753] 10.1.2.1 () {30 vars in 378 bytes} [Tue Nov  7 10:10:21 2017] GET /api/v1/liveness => generated 3 bytes in 111 msecs (HTTP/1.1 200) 2 headers in 70 bytes (1 switches on core 0)
2017-11-07 10:11:21,416 - Liveness probe - trying retrieve stored jobs from database using scheduler
WARNING:f8a_jobs.api_v1:Liveness probe - trying retrieve stored jobs from database using scheduler
2017-11-07 10:11:21,417 - Liveness probe - trying to schedule the livenessFlow
WARNING:f8a_jobs.api_v1:Liveness probe - trying to schedule the livenessFlow
2017-11-07 10:11:21,417 - Scheduling Selinon flow 'livenessFlow' with node_args: 'None', job 'None'
DEBUG:f8a_jobs.handlers.base:Scheduling Selinon flow 'livenessFlow' with node_args: 'None', job 'None'
2017-11-07 10:11:21,493 - Liveness probe - finished
WARNING:f8a_jobs.api_v1:Liveness probe - finished
[pid: 1|app: 0|req: 2754/2754] 10.1.2.1 () {30 vars in 376 bytes} [Tue Nov  7 10:11:21 2017] GET /api/v1/liveness => generated 3 bytes in 78 msecs (HTTP/1.1 200) 2 headers in 70 bytes (1 switches on core 0)
[pid: 1|app: 0|req: 2755/2755] 10.1.2.1 () {30 vars in 380 bytes} [Tue Nov  7 10:11:21 2017] GET /api/v1/readiness => generated 3 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 70 bytes (1 switches on core 0)
Exception in thread "Lucene Merge Thread #1" org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: No space left on device
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.io.IOException: No space left on device
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:403)
    at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:117)
    at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:106)
    at org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:95)
    at org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:52)
    at org.apache.lucene.store.RAMOutputStream.writeTo(RAMOutputStream.java:69)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlock(BlockTreeTermsWriter.java:962)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:759)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter$FindBlocks.freeze(BlockTreeTermsWriter.java:547)
    at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:214)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
    at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finishTerm(BlockTreeTermsWriter.java:1039)
    at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:166)
    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:112)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4132)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3728)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
[main] ERROR org.apache.maven.index.updater.DefaultIndexUpdater - Fallback to *.zip also failed: java.io.FileNotFoundException: Resource nexus-maven-repository-index.zip does not exist
Exception in thread "main" java.io.IOException: No space left on device
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:403)
    at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:117)
    at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:106)
    at org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:95)
    at org.apache.lucene.codecs.compressing.LZ4.encodeLiterals(LZ4.java:157)
    at org.apache.lucene.codecs.compressing.LZ4.encodeLastLiterals(LZ4.java:162)
    at org.apache.lucene.codecs.compressing.LZ4.compress(LZ4.java:252)
    at org.apache.lucene.codecs.compressing.CompressionMode$LZ4FastCompressor.compress(CompressionMode.java:161)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.flush(CompressingStoredFieldsWriter.java:233)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.finishDocument(CompressingStoredFieldsWriter.java:163)
    at org.apache.lucene.index.StoredFieldsProcessor.finishDocument(StoredFieldsProcessor.java:128)
    at org.apache.lucene.index.TwoStoredFieldsConsumers.finishDocument(TwoStoredFieldsConsumers.java:65)
    at org.apache.lucene.index.DocFieldProcessor.finishDocument(DocFieldProcessor.java:262)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:271)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:465)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1537)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1207)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1188)
    at org.apache.maven.index.updater.IndexDataReader.readIndex(IndexDataReader.java:94)
    at org.apache.maven.index.updater.DefaultIndexUpdater.unpackIndexData(DefaultIndexUpdater.java:424)
    at org.apache.maven.index.updater.DefaultIndexUpdater.loadIndexDirectory(DefaultIndexUpdater.java:215)
    at org.apache.maven.index.updater.DefaultIndexUpdater.access$300(DefaultIndexUpdater.java:87)
    at org.apache.maven.index.updater.DefaultIndexUpdater$LuceneIndexAdaptor.setIndexFile(DefaultIndexUpdater.java:557)
    at org.apache.maven.index.updater.DefaultIndexUpdater.fetchAndUpdateIndex(DefaultIndexUpdater.java:786)
    at org.apache.maven.index.updater.DefaultIndexUpdater.fetchAndUpdateIndex(DefaultIndexUpdater.java:171)
    at com.redhat.maven.index.checker.MavenIndexChecker.perform(MavenIndexChecker.java:314)
    at com.redhat.maven.index.checker.MavenIndexChecker.main(MavenIndexChecker.java:153)
WARNING:f8a_worker.utils:command ['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200'] ended with 1

ERROR:f8a_worker.utils:exception is fatal
2017-11-07 10:11:40,391 - Error during running command ['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200']: ''
Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/f8a_worker/utils.py", line 268, in get_command_output
    out = check_output(args, universal_newlines=True, **kwargs)
  File "/usr/lib64/python3.4/subprocess.py", line 617, in check_output
    raise CalledProcessError(retcode, process.args, output=output)
subprocess.CalledProcessError: Command '['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200']' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/f8a_jobs/handlers/maven_popular_analyses.py", line 154, in _use_maven_index_checker
    timeout=1200)
  File "/usr/lib/python3.4/site-packages/f8a_worker/utils.py", line 252, in get_command_output
    return get_command_output(args, graceful, is_json, **kwargs)
  File "/usr/lib/python3.4/site-packages/f8a_worker/utils.py", line 278, in get_command_output
    raise TaskError("Error during running command %s: %r" % (args, ex.output))
f8a_worker.errors.TaskError: Error during running command ['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200']: ''
ERROR:f8a_jobs.handlers.base:Error during running command ['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200']: ''
Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/f8a_worker/utils.py", line 268, in get_command_output
    out = check_output(args, universal_newlines=True, **kwargs)
  File "/usr/lib64/python3.4/subprocess.py", line 617, in check_output
    raise CalledProcessError(retcode, process.args, output=output)
subprocess.CalledProcessError: Command '['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200']' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/f8a_jobs/handlers/maven_popular_analyses.py", line 154, in _use_maven_index_checker
    timeout=1200)
  File "/usr/lib/python3.4/site-packages/f8a_worker/utils.py", line 252, in get_command_output
    return get_command_output(args, graceful, is_json, **kwargs)
  File "/usr/lib/python3.4/site-packages/f8a_worker/utils.py", line 278, in get_command_output
    raise TaskError("Error during running command %s: %r" % (args, ex.output))
f8a_worker.errors.TaskError: Error during running command ['java', '-Xmx768m', '-Djava.io.tmpdir=/tmp/tmpb_eev7c8', '-jar', 'maven-index-checker.jar', '-r', '1-200']: ''
2017-11-07 10:11:40,511 - central-index/ deleted
DEBUG:f8a_jobs.handlers.base:central-index/ deleted
ERROR:root:Job 'cd68a7a6-7289-4f73-9e50-43d38de8a2d3' failed, registering ErrorHandler: local variable 'output' referenced before assignment
Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/f8a_jobs/scheduler.py", line 248, in job_execute
    instance.execute(**handler_kwargs)
  File "/usr/lib/python3.4/site-packages/f8a_jobs/handlers/base.py", line 238, in execute
    return self.do_execute(popular)
  File "/usr/lib/python3.4/site-packages/f8a_jobs/handlers/maven_popular_analyses.py", line 207, in do_execute
    self._use_maven_index_checker()
  File "/usr/lib/python3.4/site-packages/f8a_jobs/handlers/maven_popular_analyses.py", line 172, in _use_maven_index_checker
    for idx, release in enumerate(output):
UnboundLocalError: local variable 'output' referenced before assignment
[pid: 1|app: 0|req: 2756/2756] 10.1.2.1 () {30 vars in 380 bytes} [Tue Nov  7 10:12:21 2017] GET /api/v1/readiness => generated 3 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 70 bytes (1 switches on core 0)
2017-11-07 10:12:21,393 - Liveness probe - trying retrieve stored jobs from database using scheduler
WARNING:f8a_jobs.api_v1:Liveness probe - trying retrieve stored jobs from database using scheduler
2017-11-07 10:12:21,393 - Liveness probe - trying to schedule the livenessFlow
WARNING:f8a_jobs.api_v1:Liveness probe - trying to schedule the livenessFlow
2017-11-07 10:12:21,394 - Scheduling Selinon flow 'livenessFlow' with node_args: 'None', job 'None'
DEBUG:f8a_jobs.handlers.base:Scheduling Selinon flow 'livenessFlow' with node_args: 'None', job 'None'
2017-11-07 10:12:21,432 - Liveness probe - finished
WARNING:f8a_jobs.api_v1:Liveness probe - finished
[pid: 1|app: 0|req: 2757/2757] 10.1.2.1 () {30 vars in 378 bytes} [Tue Nov  7 10:12:21 2017] GET /api/v1/liveness => generated 3 bytes in 47 msecs (HTTP/1.1 200) 2 headers in 70 bytes (1 switches on core 0)

Full logs are available here: http://pastebin.test.redhat.com/529944

msrb commented 6 years ago

PV should do the trick here, IMO.

msrb commented 6 years ago

Waiting for https://gitlab.cee.redhat.com/dtsd/housekeeping/issues/852.

msrb commented 6 years ago

Note the same problem is in prod as well.

kbsingh commented 6 years ago

We should be able to close this with the corresponding PVC piece moving forward, @msrb please ping @mmclanerh if you are not closed on this in the coming days.

msrb commented 6 years ago

We have the PV in rh-idev now, I will do some testing before moving the changes to prod.

msrb commented 6 years ago

Fix ready, will be promoted to prod as soon as possible: https://github.com/openshiftio/saas-analytics/pull/181