tfwillems / HipSTR

Genotype and phase short tandem repeats using Illumina whole-genome sequencing data
GNU General Public License v2.0

ERROR: Failed to load the index for file xxx.bam #69

Closed shiyirong closed 4 years ago

shiyirong commented 4 years ago

Hello HipSTR authors, I ran into an error while running HipSTR on a BAM list containing many BAM files, with a specified chromosome and a split reference region file. I submitted the script two more times. Oddly, the BAM file that triggers the error is different every time. Some other jobs, run with different chromosome and reference region settings, finished without errors. I believe I have built a correct index for every BAM file. I am confused by this error and don't know how to solve it. How does HipSTR load or check the index for a BAM file?

The call: /home/shiyr/software/HipSTR/HipSTR --fasta /home/work01/WGS/anno/hg38/v0/Homo_sapiens_assembly38.fasta --regions /home/shiyr/software/HipSTR/hg38.hipstr_reference-splitted/hg38.hipstr_reference-chrY-splited/hg38.hipstr_reference-chrY-splited0016 --bam-files /home/shiyr/software/HipSTR/bamlist --chrom chrY --max-haps 100 --str-vcf hg38.hipstr_reference-chrY-splited0016.vcf.gz --output-gls --output-pls --output-filters --output-phased-gls --viz-out hg38.hipstr_reference-chrY-splited0016-viz.gz --log hg38.hipstr_reference-chrY-splited0016_log.txt --stutter-out hg38.hipstr_reference-chrY-splited0016_stutter_models.txt

The error for the 1st time:

ERROR: Failed to load the index for file 1486.bam Exiting...

The error for the 2nd time:

ERROR: Failed to load the index for file 3595.bam Exiting...

The error for the 3rd time:

ERROR: Failed to load the index for file 1517.bam Exiting...

Thank you for your time.

tfwillems commented 4 years ago

Hi @shiyirong,

My hunch is that you encounter this issue when the process approaches the maximum open-file limit.

How many BAM files are in /home/shiyr/software/HipSTR/bamlist? If it's on the order of 1000, does increasing the open file limit solve your issue?
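A quick way to check, as a rough sketch: count the non-empty lines in the list and compare that number with the current soft limit. This assumes the list holds one BAM path per line; the function name count_bams is only illustrative and is not part of HipSTR.

```shell
# Count the non-empty lines in a BAM list file.
# Assumption: the list has exactly one BAM path per line.
count_bams() {
    grep -c -v '^[[:space:]]*$' "$1"
}

# Example, using the list path from the call above:
# echo "BAMs listed:     $(count_bams /home/shiyr/software/HipSTR/bamlist)"
# echo "Soft file limit: $(ulimit -S -n)"
```

If the first number is anywhere near the second, the open file limit is the likely culprit.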

Best, Thomas

shiyirong commented 4 years ago

Hi Thomas,

Thank you for your reply!

There are about 6,000 BAM files in /home/shiyr/software/HipSTR/bamlist. I want to run HipSTR on all of these BAM files together to get a single combined result. How can I increase the open file limit? Do I need to set additional parameters?

Thank you for your time.

tfwillems commented 4 years ago

Hi @shiyirong,

That number of files will almost certainly cause an open-file-limit issue.

The open file limit is unfortunately enforced by your operating system, not HipSTR. If you're on a Linux machine, you can view the soft file limit with ulimit -S -n and the hard file limit with ulimit -H -n. The hard limit is typically set by a system admin, while the soft limit is something you can increase up to the hard limit. See the documentation for the ulimit command for more details.

If your soft open file limit is less than your hard open file limit, you can increase the soft limit all the way up to the hard limit. So you might try something like the following to allow you to open 6K BAM files, 6K BAM file indexes and a bit of padding: ulimit -S -n 13000.
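The steps above could be scripted roughly as follows. This is only a sketch under the same estimate (one descriptor per BAM, one per index, plus padding); the padding of 1000 is arbitrary, and the function names required_fds and raise_soft_limit are hypothetical helpers, not part of HipSTR.

```shell
# Estimate the descriptors needed: one per BAM, one per index, plus padding.
# The default padding of 1000 is an arbitrary assumption.
required_fds() {
    n_bams=$1
    padding=${2:-1000}
    echo $(( 2 * $n_bams + $padding ))
}

# Raise the soft open-file limit of the current shell to at least $1.
# Leaves the limit alone if it is already high enough, and fails with a
# message if the hard limit is too low.
raise_soft_limit() {
    needed=$1
    soft=$(ulimit -S -n)
    hard=$(ulimit -H -n)
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$needed" ]; then
        return 0
    fi
    if [ "$hard" != "unlimited" ] && [ "$needed" -gt "$hard" ]; then
        echo "Need $needed descriptors but the hard limit is $hard; ask an admin." >&2
        return 1
    fi
    ulimit -S -n "$needed"
}

# Example for roughly 6000 BAM files:
# raise_soft_limit "$(required_fds 6000)" && ulimit -S -n
```

The new limit only applies to the shell where it was set and to processes started from it, so HipSTR has to be launched from that same shell (or the ulimit call placed in the job script itself).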

shiyirong commented 4 years ago

Hi Thomas, I'll try the solution you suggested and see what happens. Thank you so much!

tfwillems commented 4 years ago

Hi @shiyirong, Any luck resolving this issue? If so, I'll go ahead and close this issue.

Thanks! Thomas

shiyirong commented 4 years ago

Hi Thomas, I found that this was an intermittent problem caused by the hardware resources being occupied. Thanks for your advice, and please close this issue.