Closed jungch closed 3 years ago
Log files of apparently successful downloads which are actually truncated: slurm-1351075.log slurm-1350121.log
Hi @jungch and @JocelynSP - Thank you for reporting this issue. We will discuss internally the suggestion you raised, and we might also reach out separately to get some more details of what you've tried so we can try to reproduce what you've seen.
Hi @jungch and @JocelynSP. Unfortunately we have been unable to reproduce this issue. I have followed up via email with a suggested next step.
Hi @jungch and @JocelynSP. A quick update: We have been working on a reimplementation of the htsget protocol, which we hope will generally improve the fetching of genomic ranges. In the meantime, Daniel should be in touch to get you the files you need, from the list you shared via email previously. I will close this issue as this appears to be specific to the files you are requesting. Thanks!
incomplete read extraction within a genomic range (Error with pyega3 fetch)
Description of the bug
I tried to extract the reads mapped to the mitochondrial genome using the command below:
$ pyega3 -cf credential.json fetch --reference-name MT mt_reads.bam
Sometimes the downloaded BAM file contains only a subset of the reads it should contain, even though the pyega3 run reports no errors. The log file from a pyega3 run that yielded an incomplete set of reads contained practically nothing useful.
Later, when we tried the read extraction locally using samtools (version 1.9), the samtools command sometimes crashed with a segmentation fault when using multiple cores. The crashes seemed to happen randomly. However, when using just a single core (which is, I believe, the default setting), the crashes didn't happen. Also, the latest samtools (version 1.12) doesn't seem to have this crashing issue with multiple cores.
So I wonder whether the incomplete read extraction by pyega3 fetch is somehow related to the samtools issue?
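Since the pyega3 log gives no hint when a download is cut short, a quick local check may help flag some of the affected files. The sketch below is not part of pyega3; the function name `has_bgzf_eof` is made up for illustration. It only tests whether a file ends with the 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. This catches only abrupt truncation: a file that ends cleanly but is still missing reads would pass, so comparing read counts (for example with `samtools view -c`) is still needed.

```python
import os

# 28-byte empty BGZF block defined by the SAM/BAM specification as the EOF marker.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43"
    " 02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration only: a True result does not prove
    that every expected read is present, only that the file was not cut off
    in the middle of the compressed stream.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


if __name__ == "__main__":
    import sys

    for bam_path in sys.argv[1:]:
        status = "EOF marker present" if has_bgzf_eof(bam_path) else "EOF marker MISSING"
        print(f"{bam_path}: {status}")
```

Saved under any script name, it can be run with one or more BAM paths as arguments, e.g. the `mt_reads.bam` from the command above.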