mhammell-laboratory / TEtranscripts

A package for including transposable elements in differential enrichment analysis of sequencing datasets.
http://hammelllab.labsites.cshl.edu/software/#TEtranscripts
GNU General Public License v3.0

[E::idx_find_and_load] Could not retrieve index file for '.1712579023.8525763.bam' #185

Closed ptranvan closed 7 months ago

ptranvan commented 7 months ago

Hi,

I have this issue:

[E::idx_find_and_load] Could not retrieve index file for '.1712579023.8525763.bam'

My command line is:

singularity exec tetranscripts.sif TEtranscripts -t 1.bam 2.bam -c 3.bam 4.bam --GTF Homo_sapiens-GCA_009914755.4-2022_07-genes.gtf --TE T2T-CHM13v2_rmsk_TE.nochr.centromere.gtf --stranded reverse --sortByPos --mode multi

I allocated 100 GB of memory.

Any solutions?

olivertam commented 7 months ago

Hi,

Thank you for your interest in the software. Do you have TEtranscripts_out.cntTable and/or TEtranscripts_out_sigdiff_gene_TE.txt as your output? If so, the run should have completed successfully. If what you posted is the only warning message, then it has no impact on your run (see also #82). This is a known issue where htslib throws this warning even though the index is not required. We hope to suppress this warning in future TEtranscripts releases.

Thanks.
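
For anyone landing here with the same warning, a minimal Python sketch of the check suggested above: look for the two output files named in the reply. The function name and the `outdir` and `project` parameters are made up for illustration; the sketch assumes the output prefix is `TEtranscripts_out`, as in the file names quoted above.

```python
from pathlib import Path


def tetranscripts_outputs(outdir=".", project="TEtranscripts_out"):
    """Return a dict mapping each expected output file name to whether it exists."""
    expected = [
        f"{project}.cntTable",               # count table
        f"{project}_sigdiff_gene_TE.txt",    # significant genes/TEs
    ]
    return {name: (Path(outdir) / name).is_file() for name in expected}


if __name__ == "__main__":
    # Checks the current directory; pass another path if --outdir was set.
    for name, present in tetranscripts_outputs().items():
        print(f"{name}: {'present' if present else 'missing'}")
```

If the count table is present, the index warning can most likely be ignored, as described above.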

ptranvan commented 7 months ago

Thanks for your prompt answer. Now I have another issue. Do you know how to fix it?


INFO  @ Tue, 09 Apr 2024 15:59:48: Done building gene index ...... 

INFO  @ Tue, 09 Apr 2024 15:59:49: Building TE index ....... 

INFO  @ Tue, 09 Apr 2024 15:59:54: Done building TE index ...... 

INFO  @ Tue, 09 Apr 2024 15:59:54: 
Reading sample files ... 

[E::bgzf_flush] File write failed (wrong size)
[E::bgzf_close] File write failed
Error occurred when reading first line of sample file 1.bam. 
Error: 'samtools returned with error 1: stdout=, stderr=[bam_sort_core] merging from 46 files and 1 in-memory blocks...\nsamtools sort: failed writing to ".1712671194.444845.bam": No space left on device\n' 
[Exception type: SamtoolsError, raised in utils.py:69]

olivertam commented 7 months ago

Hi,

Based on the message, it looks like you ran out of disk space:

samtools sort: failed writing to ".1712671194.444845.bam": No space left on device

Because you're using the --sortByPos flag, TEtranscripts needs to re-sort the file by read name, and that requires disk space for the temporary sorted file.

Thanks
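
A rough way to check this before launching a run, sketched in Python. It assumes the temporary BAM is written relative to the directory the command is run from, which the relative name `.1712671194.444845.bam` in the error suggests but which is not confirmed here. The function name and the `safety_factor` of 2 are arbitrary illustrative choices, not anything TEtranscripts defines.

```python
import os
import shutil
import sys


def enough_space_for_resort(bam_paths, workdir=".", safety_factor=2.0):
    """Compare free space in workdir against the combined size of the BAM files.

    Returns (ok, free_bytes, needed_bytes). The safety factor is a guess to
    leave room for samtools' intermediate chunks plus the final sorted file.
    """
    needed = safety_factor * sum(os.path.getsize(p) for p in bam_paths)
    free = shutil.disk_usage(workdir).free
    return free >= needed, free, needed


if __name__ == "__main__":
    # Usage: python check_space.py 1.bam 2.bam 3.bam 4.bam
    ok, free, needed = enough_space_for_resort(sys.argv[1:])
    print(f"free: {free / 1e9:.1f} GB, estimated need: {needed / 1e9:.1f} GB")
    print("looks OK" if ok else "probably not enough space here")
```

Inside a container, run the check from within the container and in the same working directory as the real job, since the visible mounts may differ from the host.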

ptranvan commented 7 months ago

That's strange because I have plenty of space.

I have set the --outdir parameter and, for Singularity, the -W option in the singularity exec command. Is there a default directory used when I run the command that I am not aware of?

olivertam commented 7 months ago

Hi,

I'm afraid I have limited experience with Singularity. Are you running the Singularity module on a different disk mount from your -W location (e.g. you're running Singularity in $HOME, but your data mount is elsewhere)? I'm also assuming you're using -W (workdir) and haven't used -w (writable) by accident. You can try this and see if the container is full (for some reason):

singularity exec -W [your workdir] tetranscripts.sif df

If so, maybe re-pull the container (though I don't know why it would fill up unless it was made writable). Sorry if this does not help.

Thanks

ptranvan commented 7 months ago

I have solved the issue.

I re-sorted my .bam manually using samtools sort -n

and used TEtranscripts without --sortByPos
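
A minimal Python sketch of this workaround, for reference. It only builds and prints the two commands. The helper names and output file names are invented for illustration; the `samtools sort` options used (`-n`, `-o`, `-T`) are standard, and the TEtranscripts arguments are copied from the command at the top of this thread with --sortByPos dropped.

```python
import shlex


def name_sort_command(bam_in, bam_out, tmp_prefix=None):
    """Build a 'samtools sort -n' command that sorts a BAM file by read name."""
    cmd = ["samtools", "sort", "-n", "-o", bam_out]
    if tmp_prefix:
        # -T sets the prefix for samtools' temporary files, so they can be
        # pointed at a disk that has enough free space.
        cmd += ["-T", tmp_prefix]
    cmd.append(bam_in)
    return cmd


def tetranscripts_command(treatment, control, gene_gtf, te_gtf):
    """Build the TEtranscripts call from this thread, without --sortByPos."""
    return (["TEtranscripts", "-t", *treatment, "-c", *control,
             "--GTF", gene_gtf, "--TE", te_gtf,
             "--stranded", "reverse", "--mode", "multi"])


if __name__ == "__main__":
    bams = ["1.bam", "2.bam", "3.bam", "4.bam"]
    sorted_bams = [b.replace(".bam", ".nsorted.bam") for b in bams]
    for src, dst in zip(bams, sorted_bams):
        print(shlex.join(name_sort_command(src, dst)))
    print(shlex.join(tetranscripts_command(
        sorted_bams[:2], sorted_bams[2:],
        "Homo_sapiens-GCA_009914755.4-2022_07-genes.gtf",
        "T2T-CHM13v2_rmsk_TE.nochr.centromere.gtf")))
```

The printed lines can be pasted into a shell, or the lists passed to `subprocess.run(cmd, check=True)` where samtools and TEtranscripts are on the PATH.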

xiatianjihao commented 2 weeks ago

Hello,

Why do I still get an error saying the index file cannot be found, even though I first used samtools sort -n to process my BAM file and didn't use --sortByPos? samtools sort -n itself also doesn't generate an index file.

Thanks,

Xiatian

olivertam commented 2 weeks ago

Hi,

As mentioned above, this is an erroneous warning message from pysam that has no impact on the running of the software. You can ignore it.

Thanks

xiatianjihao commented 1 week ago

Hi,

But my pipeline just exited directly without continuing, and there were no result files at all.

[Screenshot: QQ screenshot 20241109212546]

Thanks,

Xiaojuan

olivertam commented 1 week ago

Hi,

Is your library paired-end? If so, this is likely an error due to the presence of discordant or unpaired alignments in your BAM file. STAR typically does not output those, but if you're using something like HISAT2, you would need to set the --no-mixed and --no-discordant parameters while aligning.

Let us know if that doesn't resolve the issue.

Thanks.
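
One way to look for such alignments, sketched in Python over SAM text lines (for example the output of `samtools view`). It counts paired reads whose FLAG lacks the "properly paired" bit (0x2) or has the "mate unmapped" bit (0x8); the bit values come from the SAM format, but treating these as exactly the alignments TEtranscripts trips over is an assumption. The function name is made up for illustration.

```python
PAIRED = 0x1         # read is part of a pair
PROPER_PAIR = 0x2    # aligner considers the pair properly aligned
MATE_UNMAPPED = 0x8  # the mate has no alignment


def count_problem_pairs(sam_lines):
    """Count paired reads that are not properly paired or have an unmapped mate."""
    counts = {"paired": 0, "not_proper_pair": 0, "mate_unmapped": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & PAIRED:
            continue  # single-end record, nothing to check
        counts["paired"] += 1
        if not flag & PROPER_PAIR:
            counts["not_proper_pair"] += 1
        if flag & MATE_UNMAPPED:
            counts["mate_unmapped"] += 1
    return counts


if __name__ == "__main__":
    # Tiny made-up example; only the first two columns matter here.
    example = ["@HD\tVN:1.6", "r1\t99\tchr1", "r2\t73\tchr1", "r3\t65\tchr1"]
    print(count_problem_pairs(example))
```

Non-zero counts in the last two fields for a paired-end library would be consistent with the explanation above and suggest re-aligning with the mixed and discordant modes switched off.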

xiatianjihao commented 1 week ago

Hi,

After I remapped my reads using Bowtie2, one of the samples processed successfully, but another one gave me an error. Could you help me check what might be the reason? Thank you very much!

[Screenshot: QQ screenshot 20241114153031]

Thanks,

Xiaojuan

olivertam commented 1 week ago

Hi,

It looks like you're removing duplicates in your BAM file. This essentially removes all multi-mappers, and thus negates the benefit of using TEtranscripts to analyze repetitive sequences. We recommend that you don't remove duplicates if you want to quantify TEs, especially those with high similarity to consensus (and thus more likely to be identical across loci).

Thanks

xiatianjihao commented 1 week ago

Sorry, I found the problem: I had just used the wrong file!

Best,

Xiaojuan

xiatianjihao commented 1 week ago

Thank you so much!