BIMSBbioinfo / pigx_scrnaseq

Pipeline for analysis of Dropseq single cell data
http://bioinformatics.mdc-berlin.de/pigx

Java fails with old kernel. #25

Closed Blosberg closed 6 years ago

Blosberg commented 6 years ago

Error discovered yesterday: Java tries to create a virtual machine but cannot allocate a stack of a certain size. If the memory values are too low, we simply run out of memory. If they are too high, we hit this error.


```
    jobid: 0
    output: /home/bosberg/projects/pigx_scrnaseq_masterbak/tests/out/Mapped/WT_HEK_4h_br1/WT_HEK_4h_br1.fastq.bam
    log: /home/bosberg/projects/pigx_scrnaseq_masterbak/tests/out/Log/WT_HEK_4h_br1.merge_fastq_to_bam.log
```

Log file contents:

```
$ tail /home/bosberg/projects/pigx_scrnaseq_masterbak/tests/out/Log/WT_HEK_4h_br1.merge_fastq_to_bam.log
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VariantAccumulatorExecutor$MultiThreadedChunkBased$MultiException$1java.lang.NoClassDefFoundError: com/google/common/base/Function
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VariantIteratorProducer$Threadsafe$3java.lang.NoClassDefFoundError: com/google/common/base/Function
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VariantIteratorProducer$Threadsafe$NonUniqueVariantPredicatejava.lang.NoClassDefFoundError: com/google/common/base/Predicate
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VariantIteratorProducer$Threadsafe$OverlapsPredicatejava.lang.NoClassDefFoundError: com/google/common/base/Predicate
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VcfFileSegmentGenerator$1$1java.lang.NoClassDefFoundError: com/google/common/base/Predicate
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VcfFileSegmentGenerator$ByWholeContig$1java.lang.NoClassDefFoundError: com/google/common/base/Function
DEBUG    2018-03-22 17:56:54    ClassFinder    could not load class: picard.vcf.processor.VcfFileSegmentGenerator$WidthLimitingDecorator$1java.lang.NoClassDefFoundError: com/google/common/base/Function
[Thu Mar 22 17:56:54 CET 2018] picard.sam.FastqToSam FASTQ=/home/bosberg/projects/pigx_scrnaseq_masterbak/tests/sample_data/reads/HEK_4h_br1_R1.fastq.gz FASTQ2=/home/bosberg/projects/pigx_scrnaseq_masterbak/tests/sample_data/reads/HEK_4h_br1_R2.fastq.gz QUALITY_FORMAT=Standard OUTPUT=/home/bosberg/projects/pigx_scrnaseq_masterbak/tests/out/Mapped/WT_HEK_4h_br1/WT_HEK_4h_br1.fastq.bam SAMPLE_NAME=WT_HEK_4h_br1 SORT_ORDER=queryname    USE_SEQUENTIAL_FASTQS=false READ_GROUP_NAME=A MIN_Q=0 MAX_Q=93 STRIP_UNPAIRED_MATE_NUMBER=false ALLOW_AND_IGNORE_EMPTY_LINES=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
```

It seems that all of the rules that run Java have this problem, and the latest diagnostic efforts indicate that it is related to the old kernel on the max cluster.

rekado commented 6 years ago

Unfortunately, this is not a problem with the pipelines, but with the interaction between glibc 2.26 and the old kernel on RHEL6 systems. glibc 2.26 assumes that kernels have an implementation of the syscall prlimit64, which is correct for all kernels that glibc 2.26 supports (version 3.4 and up). The RHEL6 kernel is 2.6.32 with a bunch of patches to backport features and fixes from later kernels. Sadly, the patches don't include an implementation for the prlimit64 syscall.
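
For anyone wanting to check quickly whether a node falls below the kernel version mentioned above, here is a minimal Python sketch. It only compares version numbers; the 3.4 threshold is copied from this comment rather than checked against the glibc release notes, the function names are made up for illustration, and the RHEL6-style release string in the usage note is just an example. A version comparison is only a heuristic: vendor kernels carry backports, so the number alone does not prove that a syscall is present or absent.

```python
import platform
import re

# Assumption: threshold taken from the comment above, not from glibc's release notes.
MIN_KERNEL = (3, 4, 0)


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a release string like '2.6.32-696.el6.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def kernel_is_supported(release, minimum=MIN_KERNEL):
    """Return True if the release's version number is at least `minimum`."""
    return parse_kernel_version(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    status = "meets" if kernel_is_supported(release) else "is below"
    print(f"kernel {release} {status} the assumed minimum {MIN_KERNEL}")
```

Calling `kernel_is_supported("2.6.32-696.el6.x86_64")` returns `False`, since 2.6.32 sorts below the assumed minimum.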

As a result, software built with glibc 2.26 won't be usable if it calls `getrlimit`, which glibc 2.26 implements using prlimit64. Going forward, we will patch glibc to provide an alternative implementation for older kernels that doesn't rely on prlimit64.
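
A rough way to probe for the missing syscall from user space is sketched below in Python. It relies on `resource.prlimit` from the standard library (Linux only, Python 3.4 and later), which as far as I know goes through the C library's `prlimit` wrapper and therefore the prlimit64 syscall. The function name `probe_prlimit` and its status strings are invented for this illustration, and I have not run it on an affected RHEL6 node, so treat the `ENOSYS` branch as an expectation rather than a confirmed result.

```python
import errno
import resource


def probe_prlimit():
    """Return 'ok', 'missing', 'enosys' or 'error' depending on how prlimit behaves here."""
    # resource.prlimit is only compiled in on platforms that provide prlimit().
    if not hasattr(resource, "prlimit"):
        return "missing"
    try:
        # pid 0 means the current process; with no new limits this is a pure query.
        resource.prlimit(0, resource.RLIMIT_STACK)
    except OSError as exc:
        # ENOSYS is what a kernel without the syscall is expected to report.
        return "enosys" if exc.errno == errno.ENOSYS else "error"
    return "ok"


if __name__ == "__main__":
    print("prlimit probe:", probe_prlimit())
    # The classic call, for comparison: (soft, hard) stack limits of this process.
    print("getrlimit(RLIMIT_STACK):", resource.getrlimit(resource.RLIMIT_STACK))
```

If the probe prints `enosys`, the kernel most likely lacks prlimit64, which would match the failure described here.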

For the time being I have reverted glibc to version 2.25 on branch rhel6 in the Guix repository, which works with old kernels down to the vanilla 2.6.32. We are rebuilding software with that glibc version, so that Java will work again even on clusters running close-to-end-of-life RHEL6.

rekado commented 6 years ago

This is fixed in the core-updates and rhel6 branches. core-updates will soon be merged into master.