wodanaz / Assembling_viruses


issues with picard #43

Open wodanaz opened 3 years ago

wodanaz commented 3 years ago

Hi John, hope you are well.

I am not sure why MarkDuplicates is not working anymore. I thought it was a memory allocation issue, but allocating more memory did not help. I also tried to update.

Here is the output of one of the errors.

OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file:
   29381
Try using the -Djava.io.tmpdir= option to select an alternate temp location.

INFO    2021-04-20 22:36:58 MarkDuplicates  

********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
**********    MarkDuplicates -I /gpfs/fs1/data/wraycompute/alejo/CoV_data/Hosp_April_20_data/tmp.FN0HfDEacJ/Duke-CMB-101M604-00000832-LC-10A.bam -O /gpfs/fs1/data/wraycompute/alejo/CoV_data/Hosp_April_20_data/tmp.FN0HfDEacJ/Duke-CMB-101M604-00000832-LC-10A.dedup.bam -M /gpfs/fs1/data/wraycompute/alejo/CoV_data/Hosp_April_20_data/tmp.FN0HfDEacJ/Duke-CMB-101M604-00000832-LC-10A.metric.txt
**********

22:36:58.948 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/fs1/data/wraycompute/alejo/conda_envs/escapevariants/share/picard-2.25.0-1/picard.jar!/com/intel/gkl/native/libgkl_compression.so
22:36:58.959 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No space left on device)
22:36:58.960 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/fs1/data/wraycompute/alejo/conda_envs/escapevariants/share/picard-2.25.0-1/picard.jar!/com/intel/gkl/native/libgkl_compression.so
22:36:58.960 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No space left on device)
[Tue Apr 20 22:36:58 EDT 2021] MarkDuplicates INPUT=[/gpfs/fs1/data/wraycompute/alejo/CoV_data/Hosp_April_20_data/tmp.FN0HfDEacJ/Duke-CMB-101M604-00000832-LC-10A.bam] OUTPUT=/gpfs/fs1/data/wraycompute/alejo/CoV_data/Hosp_April_20_data/tmp.FN0HfDEacJ/Duke-CMB-101M604-00000832-LC-10A.dedup.bam METRICS_FILE=/gpfs/fs1/data/wraycompute/alejo/CoV_data/Hosp_April_20_data/tmp.FN0HfDEacJ/Duke-CMB-101M604-00000832-LC-10A.metric.txt    MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Tue Apr 20 22:36:58 EDT 2021] Executing as ab620@x2-05-3.genome.duke.edu on Linux 2.6.32-642.13.1.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_282-b08; Deflater: Jdk; Inflater: Jdk; Provider GCS is not available; Picard version: 2.25.0
INFO    2021-04-20 22:36:59 MarkDuplicates  Start of doWork freeMemory: 2037383104; totalMemory: 2058354688; maxMemory: 13362528256
INFO    2021-04-20 22:36:59 MarkDuplicates  Reading input file and constructing read end information.
INFO    2021-04-20 22:36:59 MarkDuplicates  Will retain up to 48414957 data points before spilling to disk.
22:36:59.213 WARN  IntelDeflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
[Tue Apr 20 22:36:59 EDT 2021] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2058354688
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Exception creating temporary directory.
    at htsjdk.samtools.util.IOUtil.createTempDir(IOUtil.java:996)
    at htsjdk.samtools.CoordinateSortedPairInfoMap.<init>(CoordinateSortedPairInfoMap.java:59)
    at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.<init>(DiskBasedReadEndsForMarkDuplicatesMap.java:57)
    at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:511)
    at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:257)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Caused by: java.io.IOException: No space left on device
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createTempFile(File.java:2026)
    at java.io.File.createTempFile(File.java:2072)
    at htsjdk.samtools.util.IOUtil.createTempDir(IOUtil.java:987)
    ... 7 more
johnbradley commented 3 years ago

The following error means someone has exhausted disk space on the node:

Caused by: java.io.IOException: No space left on device

In this case I think the /tmp directory has filled up on x2-05-3.genome.duke.edu, so your program cannot continue. Send an email to gcb-help@duke.edu so the sysadmins can take a look at what used up all the space on this node.
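A quick way to confirm this kind of failure on a node is to check both block and inode usage for the filesystem holding /tmp (a sketch using standard coreutils `df`):

```shell
# "No space left on device" usually means the filesystem is out of blocks,
# but inode exhaustion on the same filesystem raises the identical error.
df -h /tmp   # block usage, human-readable
df -i /tmp   # inode usage; 100% IUse% also triggers ENOSPC
```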

wodanaz commented 3 years ago

That makes sense.

Thanks, I just filed a ticket.

Thank you!

johnbradley commented 3 years ago

If you need to get going immediately, you could try avoiding this broken node. There is an --exclude=<node name list> flag for sbatch, but I haven't used it. You could try adding --exclude=x2-05-3 to the sbatch invocations. For example, to add the flag to all the array jobs we run, you could add it to the following line:

https://github.com/wodanaz/Assembling_viruses/blob/0bee368c70e06dae8a1075384ec90a85d8a8fab3/scripts/sbatch-array.sh#L27

Something like:

sbatch --exclude=x2-05-3 --wait --array=1...
wodanaz commented 3 years ago

I am trying this now. I will let you know; I have not used it before either.

wodanaz commented 3 years ago

I think I solved it for both Picard and GATK. There was the same issue with GATK, and I solved it by adding "-Djava.io.tmpdir=/data/covid19lab/tmp" to the lines that call gatk.
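For reference, the JVM flag goes in slightly different places for the two tools. A sketch, with illustrative file names and the commands echoed rather than executed (the tmp path is the one from the comment above, but any writable shared directory works):

```shell
# Point java.io.tmpdir at a directory with free space instead of the
# node-local /tmp. The path here is an example.
TMPDIR_OPT="-Djava.io.tmpdir=/data/covid19lab/tmp"

# Picard: JVM options must come before -jar.
echo java "$TMPDIR_OPT" -jar picard.jar MarkDuplicates \
    -I sample.bam -O sample.dedup.bam -M sample.metric.txt

# GATK (gatk4 wrapper): JVM options are forwarded via --java-options.
echo gatk --java-options "$TMPDIR_OPT" HaplotypeCaller \
    -I sample.bam -O sample.vcf -R reference.fasta
```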

johnbradley commented 3 years ago

Hard-coding the tmp directory like this will hurt the portability of the pipeline. We already have a temporary directory that is cleaned up by the pipeline: $EVDIR. @wodanaz Are you good with me changing the scripts to store the tmp files within $EVDIR?
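One way that change could look, as a sketch ($EVDIR is the pipeline's existing temp directory; the mktemp fallback is only so the snippet stands alone):

```shell
# Assume EVDIR is already created by the pipeline; fall back to a
# throwaway directory for illustration.
EVDIR="${EVDIR:-$(mktemp -d)}"
mkdir -p "$EVDIR/tmp"

# Each java-based step would then point java.io.tmpdir inside EVDIR,
# so the tmp files are removed when the pipeline cleans up EVDIR.
echo java "-Djava.io.tmpdir=$EVDIR/tmp" -jar picard.jar MarkDuplicates ...
```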

wodanaz commented 3 years ago

That is true. Absolutely, I am okay with that; $EVDIR is actually simpler.

wodanaz commented 3 years ago

Also, where would be the best place to put the account flag '-A covid19lab'?

johnbradley commented 3 years ago

There is an environment variable SBATCH_ACCOUNT that can be exported to apply this setting to all sbatch calls. Right now you could turn this on by running the following in your HARDAC terminal session:

export SBATCH_ACCOUNT=covid19lab

Then, within that terminal session, the account will be used when you run run-escape-variants.sh or run-dds-escape-variants.sh.

Since this export step is easy to forget, I would like to make a change that puts the account name in a config file. That way you could set it up once. How does that sound, @wodanaz?
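The config-file idea could look something like this (the file name is hypothetical; the real name would be whatever the pipeline adopts):

```shell
# Write the one-time config (normally done once by hand, not per run).
cat > escape-variants.config <<'EOF'
# Account applied to every sbatch submission
export SBATCH_ACCOUNT=covid19lab
EOF

# The run scripts would source it once at startup:
. ./escape-variants.config
echo "sbatch account: $SBATCH_ACCOUNT"
```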

wodanaz commented 3 years ago

It sounds like a great plan. I am afraid I would forget to switch accounts when I need to use the wraycompute or mcclaylab accounts.