OliveiraDS-hub / ChimeraTE

A pipeline to detect chimeric transcripts derived from genes and transposable elements.
GNU General Public License v3.0

samtools index: failed to create index, Numerical result out of range #11

Closed majie5976 closed 9 months ago

majie5976 commented 9 months ago

Hi, Thank you for this fantastic tool. I'm currently using it, but I've come across some errors during the alignment process. Could you please help me resolve this issue? Thank you. Here's the full error message:

[Wednesday 20/9/2023 - 19h:36] Performing alignment
[E::hts_idx_check_range] Region 536907104..536921927 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.4436297' with ref_name='Chr01', ref_length=1268505426, flags=163, pos=536907105 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/accepted_hits.bam": Numerical result out of range
[E::hts_idx_check_range] Region 536907104..536921927 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.4436297' with ref_name='Chr01', ref_length=1268505426, flags=163, pos=536907105 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/fwd1_f.bam": Numerical result out of range
[E::hts_idx_check_range] Region 536907104..536921927 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.4436297' with ref_name='Chr01', ref_length=1268505426, flags=83, pos=536907105 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/fwd2_f.bam": Numerical result out of range
[E::hts_idx_check_range] Region 536907104..536921927 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.4436297' with ref_name='Chr01', ref_length=1268505426, flags=163, pos=536907105 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/fwd.bam": Numerical result out of range
[E::hts_idx_check_range] Region 537184991..537185089 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.24506950' with ref_name='Chr01', ref_length=1268505426, flags=147, pos=537184992 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/rev1_r.bam": Numerical result out of range
[E::hts_idx_check_range] Region 537184984..537185085 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.24506950' with ref_name='Chr01', ref_length=1268505426, flags=99, pos=537184985 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/rev2_r.bam": Numerical result out of range
[E::hts_idx_check_range] Region 537184984..537185085 cannot be stored in a bai index. Try using a csi index
[E::sam_index] Read 'SRR21863933.24506950' with ref_name='Chr01', ref_length=1268505426, flags=99, pos=537184985 cannot be indexed
samtools index: failed to create index for "/home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/rev.bam": Numerical result out of range
Done!
[Wednesday 20/9/2023 - 19h:49] Genes expression
Done!
[Wednesday 20/9/2023 - 19h:57] Getting fpkm
Done!
[Wednesday 20/9/2023 - 19h:58] Strand-specific expression analysis
Done!
[Wednesday 20/9/2023 - 20h:08] Chimeric reads pairs identification
Traceback (most recent call last):
  File "chimTE_mode1.py", line 152, in <module>
    alignment_func(out_dir,group,aln_dir,mate1,mate2)
  File "scripts/mode1_alignment.py", line 186, in alignment_func
    chimeric_TEs = expressed_TEs.intersect(chim_reads, wa=True, nonamecheck=True, F=str(args.overlap))
  File "/home/keyilong/miniconda3/envs/chimeraTE/lib/python3.6/site-packages/pybedtools/bedtool.py", line 923, in decorated
    result = method(self, *args, **kwargs)
  File "/home/keyilong/miniconda3/envs/chimeraTE/lib/python3.6/site-packages/pybedtools/bedtool.py", line 407, in wrapped
    decode_output=decode_output,
  File "/home/keyilong/miniconda3/envs/chimeraTE/lib/python3.6/site-packages/pybedtools/helpers.py", line 460, in call_bedtools
    raise BEDToolsError(subprocess.list2cmdline(cmds), stderr)
pybedtools.helpers.BEDToolsError: Command was:

bedtools intersect -a /tmp/pybedtools.ewzi1yrv.tmp -wa -nonamecheck -b /home/keyilong/crickets/RNA_seq/ChimeraTE-ChimeraTE-v1.1.1/projects/mode1_output/AntF/alignment/chim_reads_coord.bed -F 0.5

Error message was: Error: Unable to open file /tmp/pybedtools.ewzi1yrv.tmp. Exiting.

OliveiraDS-hub commented 9 months ago

Dear Muhammad, I'm glad that ChimeraTE will be useful for your research.

This error message: [E::hts_idx_check_range] Region 536907104..536921927 cannot be stored in a bai index. Try using a csi index happens because you have chromosomes bigger than 512 Mb. A BAI index can only address coordinates up to 2^29 bp (536,870,912), so indexing fails as soon as a read maps beyond that point.
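
To make the limit concrete, here is a minimal sketch that checks the coordinates reported in the log against the BAI ceiling. The constant and function names are illustrative assumptions, not part of ChimeraTE or samtools, and the exact boundary handling is simplified.

```python
# Illustrative check only; names are hypothetical, not ChimeraTE or samtools code.
BAI_MAX_POS = 2 ** 29  # 536,870,912 bp (512 Mb): highest coordinate a BAI index can address


def fits_in_bai(end_pos: int) -> bool:
    """Return True if a region ending at end_pos can be stored in a BAI index."""
    return end_pos <= BAI_MAX_POS


# Values copied from the error log in this issue
region_end = 536921927   # end of the first region that failed
ref_length = 1268505426  # length of Chr01

print(fits_in_bai(region_end))  # False
print(fits_in_bai(ref_length))  # False
```

Both values lie past the ceiling, which is consistent with samtools refusing to write a BAI index for any read on the far part of Chr01.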

Adding the -c parameter to the samtools index command, which writes a CSI index instead of a BAI one, would solve your issue, but I don't know the downstream consequences that it might have in the pipeline.
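
As a rough sketch of that idea, the snippet below builds a samtools index command and adds -c only when a reference sequence is too long for BAI. The helper names are hypothetical and this is not necessarily how the pipeline's alignment script handles it; it only assumes that samtools is on the PATH when index_bam is actually called.

```python
import subprocess
from typing import List

BAI_MAX_POS = 2 ** 29  # longest reference a BAI index can cover, in bp


def build_index_cmd(bam_path: str, ref_lengths: List[int]) -> List[str]:
    """Build a samtools index command, switching to CSI (-c) for long references."""
    cmd = ["samtools", "index"]
    if any(length > BAI_MAX_POS for length in ref_lengths):
        cmd.append("-c")  # write a .csi index instead of a .bai index
    cmd.append(bam_path)
    return cmd


def index_bam(bam_path: str, ref_lengths: List[int]) -> None:
    """Run the command; raises CalledProcessError if samtools fails."""
    subprocess.run(build_index_cmd(bam_path, ref_lengths), check=True)


# Example with the Chr01 length from the log (hypothetical file name)
print(build_index_cmd("accepted_hits.bam", [1268505426]))
```

With -c the index is written with a .csi extension rather than .bai, so any downstream step that looks for a .bai file by name would need to be checked as well.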

I have to test it before I can tell you what is going on.

Is the genome that you are using publicly available? I think your issue will lead to a good enhancement of the pipeline, since other people may run into the same error message.

Best

OliveiraDS-hub commented 9 months ago

Dear @majie5976

Can you confirm that you are using ChimeraTE with the conda env?

I've updated the script that performs the alignment step.

Could you please download ChimeraTE again and test it?

The index error is supposed to be fixed, but I'm not sure about the last error.

I hope to hear from you soon.

Best

majie5976 commented 9 months ago

Dear @OliveiraDS-hub, My apologies for the delayed response, and thank you for your prompt reply. I am using ChimeraTE with the conda environment. Should I reinstall it through conda env, or consider using the source code? Thanks, Majie

OliveiraDS-hub commented 9 months ago

Dear Muhammad,

You can keep the conda env, nothing has changed on it.

Just download the ChimeraTE repo to your machine again, and rerun your analysis.

And please, can you also confirm whether the example data is working properly with the expected results?

Thank you

majie5976 commented 9 months ago

Dear @OliveiraDS-hub, I've downloaded the repository and am currently running ChimeraTE mode1 with my data. The example data worked perfectly. I will let you know whether I succeed or fail again. Thanks!

majie5976 commented 9 months ago

Dear @OliveiraDS-hub , It worked without any error. Many thanks!