Closed Linran2023 closed 1 year ago
Hi,
Thank you for your interest in the software. I have a few questions for you: 1) How big is your BAM file? If it's huge, it may still be processing (though that would certainly be longer than expected). 2) Is it still running? I'm wondering whether it is hanging or has exited. 3) What resources are you requesting? 4) Is your library unstranded? That's the mode it's currently running in, and I want to make sure that's correct (especially since it is taking so long to run).
Thanks.
Hi Oliver,
Thank you so much for your prompt reply!
My answers:
My BAM file is 4.7G
It exited and is not running anymore.
These are the resources I requested:
hard resource_list: h_vmem=16G,tmem=16G
This is the usage in the output file:
usage 1: cpu=01:00:12, mem=13261.54936 GB s, io=29.34027 GB, vmem=4.637G, maxvmem=5.363G
These are the first several lines of the STAR ReadsPerGene.out.tab file of this sample. I think this is unstranded.
ENSG00000274630 0 522 0
ENSG00000204815 22 10 12
ENSG00000173786 2991 748 2970
ENSG00000168259 1994 70 2878
ENSG00000168256 605 241 613
ENSG00000187595 14 29 7
ENSG00000267221 21 0 21
ENSG00000108771 525 5 520
Thanks!
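A quick way to check strandedness from the whole ReadsPerGene.out.tab file, rather than from a handful of rows, is to total the per-strand columns. The sketch below assumes the standard STAR layout (column 2: unstranded counts, column 3: counts with read 1 on the same strand as the gene, column 4: counts with read 1 on the opposite strand) and skips the N_* summary rows at the top of the file. The function name strand_totals is made up for illustration.

```shell
# strand_totals: hypothetical helper, not part of STAR or TEtranscripts.
# Reads a STAR ReadsPerGene.out.tab file (or stdin) and totals the
# forward-strand (column 3) and reverse-strand (column 4) counts.
strand_totals() {
  awk '
    $1 ~ /^N_/ { next }                 # skip N_unmapped, N_multimapping, ...
    { fwd += $3; rev += $4 }            # accumulate per-strand gene counts
    END { printf "forward=%d reverse=%d\n", fwd, rev }
  ' "$@"
}
```

It could be run as, for example, strand_totals Mock1ReadsPerGene.out.tab. Roughly equal totals point to an unstranded library; if one column clearly dominates, the library is probably stranded in that direction and the strandedness setting passed to TEcount would need to match. On the eight rows quoted above the totals are forward=1625 and reverse=7021, which is one reason to check the full file rather than a few rows.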
Hi.
Thanks for your response. It does appear that it shouldn't be a problem. Could you provide the header to your BAM file? You should be able to get it by doing the following:
$ samtools view -H Mock1Aligned.sortedByCoord.out.bam
Is there any more information in the various log files from the TEcount run? I noticed the "......" at the end of your log, and I just wanted to make sure that I got all the error messages.
If that doesn't resolve the issue, I might have to ask for a sample of your BAM file (or its entirety) to see if I can reproduce the error.
Thanks.
Hi Oliver,
Thanks!
This is the header of the BAM file:
$ samtools view -H Mock1Aligned.sortedByCoord.out.bam
@HD VN:1.4 SO:coordinate
@SQ SN:1 LN:248956422
@SQ SN:10 LN:133797422
@SQ SN:11 LN:135086622
@SQ SN:12 LN:133275309
@SQ SN:13 LN:114364328
@SQ SN:14 LN:107043718
@SQ SN:15 LN:101991189
@SQ SN:16 LN:90338345
@SQ SN:17 LN:83257441
@SQ SN:18 LN:80373285
@SQ SN:19 LN:58617616
@SQ SN:2 LN:242193529
@SQ SN:20 LN:64444167
@SQ SN:21 LN:46709983
@SQ SN:22 LN:50818468
@SQ SN:3 LN:198295559
@SQ SN:4 LN:190214555
@SQ SN:5 LN:181538259
@SQ SN:6 LN:170805979
@SQ SN:7 LN:159345973
@SQ SN:8 LN:145138636
@SQ SN:9 LN:138394717
@SQ SN:MT LN:16569
@SQ SN:X LN:156040895
@SQ SN:Y LN:57227415
@SQ SN:KI270728.1 LN:1872759
@SQ SN:KI270727.1 LN:448248
@SQ SN:KI270442.1 LN:392061
@SQ SN:KI270729.1 LN:280839
@SQ SN:GL000225.1 LN:211173
@SQ SN:KI270743.1 LN:210658
@SQ SN:GL000008.2 LN:209709
@SQ SN:GL000009.2 LN:201709
@SQ SN:KI270747.1 LN:198735
@SQ SN:KI270722.1 LN:194050
@SQ SN:GL000194.1 LN:191469
@SQ SN:KI270742.1 LN:186739
@SQ SN:GL000205.2 LN:185591
@SQ SN:GL000195.1 LN:182896
@SQ SN:KI270736.1 LN:181920
@SQ SN:KI270733.1 LN:179772
@SQ SN:GL000224.1 LN:179693
@SQ SN:GL000219.1 LN:179198
@SQ SN:KI270719.1 LN:176845
@SQ SN:GL000216.2 LN:176608
@SQ SN:KI270712.1 LN:176043
@SQ SN:KI270706.1 LN:175055
@SQ SN:KI270725.1 LN:172810
@SQ SN:KI270744.1 LN:168472
@SQ SN:KI270734.1 LN:165050
@SQ SN:GL000213.1 LN:164239
@SQ SN:GL000220.1 LN:161802
@SQ SN:KI270715.1 LN:161471
@SQ SN:GL000218.1 LN:161147
@SQ SN:KI270749.1 LN:158759
@SQ SN:KI270741.1 LN:157432
@SQ SN:GL000221.1 LN:155397
@SQ SN:KI270716.1 LN:153799
@SQ SN:KI270731.1 LN:150754
@SQ SN:KI270751.1 LN:150742
@SQ SN:KI270750.1 LN:148850
@SQ SN:KI270519.1 LN:138126
@SQ SN:GL000214.1 LN:137718
@SQ SN:KI270708.1 LN:127682
@SQ SN:KI270730.1 LN:112551
@SQ SN:KI270438.1 LN:112505
@SQ SN:KI270737.1 LN:103838
@SQ SN:KI270721.1 LN:100316
@SQ SN:KI270738.1 LN:99375
@SQ SN:KI270748.1 LN:93321
@SQ SN:KI270435.1 LN:92983
@SQ SN:GL000208.1 LN:92689
@SQ SN:KI270538.1 LN:91309
@SQ SN:KI270756.1 LN:79590
@SQ SN:KI270739.1 LN:73985
@SQ SN:KI270757.1 LN:71251
@SQ SN:KI270709.1 LN:66860
@SQ SN:KI270746.1 LN:66486
@SQ SN:KI270753.1 LN:62944
@SQ SN:KI270589.1 LN:44474
@SQ SN:KI270726.1 LN:43739
@SQ SN:KI270735.1 LN:42811
@SQ SN:KI270711.1 LN:42210
@SQ SN:KI270745.1 LN:41891
@SQ SN:KI270714.1 LN:41717
@SQ SN:KI270732.1 LN:41543
@SQ SN:KI270713.1 LN:40745
@SQ SN:KI270754.1 LN:40191
@SQ SN:KI270710.1 LN:40176
@SQ SN:KI270717.1 LN:40062
@SQ SN:KI270724.1 LN:39555
@SQ SN:KI270720.1 LN:39050
@SQ SN:KI270723.1 LN:38115
@SQ SN:KI270718.1 LN:38054
@SQ SN:KI270317.1 LN:37690
@SQ SN:KI270740.1 LN:37240
@SQ SN:KI270755.1 LN:36723
@SQ SN:KI270707.1 LN:32032
@SQ SN:KI270579.1 LN:31033
@SQ SN:KI270752.1 LN:27745
@SQ SN:KI270512.1 LN:22689
@SQ SN:KI270322.1 LN:21476
@SQ SN:GL000226.1 LN:15008
@SQ SN:KI270311.1 LN:12399
@SQ SN:KI270366.1 LN:8320
@SQ SN:KI270511.1 LN:8127
@SQ SN:KI270448.1 LN:7992
@SQ SN:KI270521.1 LN:7642
@SQ SN:KI270581.1 LN:7046
@SQ SN:KI270582.1 LN:6504
@SQ SN:KI270515.1 LN:6361
@SQ SN:KI270588.1 LN:6158
@SQ SN:KI270591.1 LN:5796
@SQ SN:KI270522.1 LN:5674
@SQ SN:KI270507.1 LN:5353
@SQ SN:KI270590.1 LN:4685
@SQ SN:KI270584.1 LN:4513
@SQ SN:KI270320.1 LN:4416
@SQ SN:KI270382.1 LN:4215
@SQ SN:KI270468.1 LN:4055
@SQ SN:KI270467.1 LN:3920
@SQ SN:KI270362.1 LN:3530
@SQ SN:KI270517.1 LN:3253
@SQ SN:KI270593.1 LN:3041
@SQ SN:KI270528.1 LN:2983
@SQ SN:KI270587.1 LN:2969
@SQ SN:KI270364.1 LN:2855
@SQ SN:KI270371.1 LN:2805
@SQ SN:KI270333.1 LN:2699
@SQ SN:KI270374.1 LN:2656
@SQ SN:KI270411.1 LN:2646
@SQ SN:KI270414.1 LN:2489
@SQ SN:KI270510.1 LN:2415
@SQ SN:KI270390.1 LN:2387
@SQ SN:KI270375.1 LN:2378
@SQ SN:KI270420.1 LN:2321
@SQ SN:KI270509.1 LN:2318
@SQ SN:KI270315.1 LN:2276
@SQ SN:KI270302.1 LN:2274
@SQ SN:KI270518.1 LN:2186
@SQ SN:KI270530.1 LN:2168
@SQ SN:KI270304.1 LN:2165
@SQ SN:KI270418.1 LN:2145
@SQ SN:KI270424.1 LN:2140
@SQ SN:KI270417.1 LN:2043
@SQ SN:KI270508.1 LN:1951
@SQ SN:KI270303.1 LN:1942
@SQ SN:KI270381.1 LN:1930
@SQ SN:KI270529.1 LN:1899
@SQ SN:KI270425.1 LN:1884
@SQ SN:KI270396.1 LN:1880
@SQ SN:KI270363.1 LN:1803
@SQ SN:KI270386.1 LN:1788
@SQ SN:KI270465.1 LN:1774
@SQ SN:KI270383.1 LN:1750
@SQ SN:KI270384.1 LN:1658
@SQ SN:KI270330.1 LN:1652
@SQ SN:KI270372.1 LN:1650
@SQ SN:KI270548.1 LN:1599
@SQ SN:KI270580.1 LN:1553
@SQ SN:KI270387.1 LN:1537
@SQ SN:KI270391.1 LN:1484
@SQ SN:KI270305.1 LN:1472
@SQ SN:KI270373.1 LN:1451
@SQ SN:KI270422.1 LN:1445
@SQ SN:KI270316.1 LN:1444
@SQ SN:KI270340.1 LN:1428
@SQ SN:KI270338.1 LN:1428
@SQ SN:KI270583.1 LN:1400
@SQ SN:KI270334.1 LN:1368
@SQ SN:KI270429.1 LN:1361
@SQ SN:KI270393.1 LN:1308
@SQ SN:KI270516.1 LN:1300
@SQ SN:KI270389.1 LN:1298
@SQ SN:KI270466.1 LN:1233
@SQ SN:KI270388.1 LN:1216
@SQ SN:KI270544.1 LN:1202
@SQ SN:KI270310.1 LN:1201
@SQ SN:KI270412.1 LN:1179
@SQ SN:KI270395.1 LN:1143
@SQ SN:KI270376.1 LN:1136
@SQ SN:KI270337.1 LN:1121
@SQ SN:KI270335.1 LN:1048
@SQ SN:KI270378.1 LN:1048
@SQ SN:KI270379.1 LN:1045
@SQ SN:KI270329.1 LN:1040
@SQ SN:KI270419.1 LN:1029
@SQ SN:KI270336.1 LN:1026
@SQ SN:KI270312.1 LN:998
@SQ SN:KI270539.1 LN:993
@SQ SN:KI270385.1 LN:990
@SQ SN:KI270423.1 LN:981
@SQ SN:KI270392.1 LN:971
@SQ SN:KI270394.1 LN:970
@PG ID:STAR PN:STAR VN:2.7.3a CL:STAR --runThreadN 1 --genomeDir /home/regmjag/Genomes/HumanSTAR/index --readFilesIn /SAN/breuerlab/pathseq1/AGuerra_myriadDownTemp/Mock1_S1_R1_001.fastq.gz /SAN/breuerlab/pathseq1/AGuerra_myriadDownTemp/Mock1_S1_R2_001.fastq.gz --readFilesCommand "gzip -dcf" --outFileNamePrefix Mock1 --outSAMtype BAM SortedByCoordinate --outFilterMultimapNmax 100 --winAnchorMultimapNmax 100 --quantMode GeneCounts
@CO user command line: STAR --runThreadN 1 --genomeDir /home/regmjag/Genomes/HumanSTAR/index --readFilesIn /SAN/breuerlab/pathseq1/AGuerra_myriadDownTemp/Mock1_S1_R1_001.fastq.gz /SAN/breuerlab/pathseq1/AGuerra_myriadDownTemp/Mock1_S1_R2_001.fastq.gz --readFilesCommand "gzip -dcf" --quantMode GeneCounts --outFileNamePrefix Mock1 --outFilterMultimapNmax 100 --winAnchorMultimapNmax 100 --outSAMtype BAM SortedByCoordinate
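A header this long is easier to sanity-check when summarised. The sketch below reads SAM header text on stdin and reports the declared sort order (the SO field of the @HD line) and the number of @SQ reference sequences. The function name header_summary is hypothetical and is not part of samtools or TEtranscripts.

```shell
# header_summary: hypothetical helper that reads SAM header text on stdin
# and reports the declared sort order and the number of reference sequences.
header_summary() {
  awk '
    $1 == "@HD" { for (i = 2; i <= NF; i++) if ($i ~ /^SO:/) so = substr($i, 4) }
    $1 == "@SQ" { nseq++ }              # one @SQ line per reference sequence
    END { printf "sort_order=%s sequences=%d\n", (so == "" ? "unknown" : so), nseq }
  '
}
```

It would be used as samtools view -H Mock1Aligned.sortedByCoord.out.bam | header_summary. For the header above this reports sort_order=coordinate and sequences=194, which is consistent with a coordinate-sorted BAM aligned to the full GRCh38 primary assembly with Ensembl-style chromosome names (1, 2, ..., MT).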
[E::idx_find_and_load] Could not retrieve index file for '.1677762396.6834505.bam'
==============================================================
job_number: 9835568
exec_file: job_scripts/9835568
submission_time: Thu Mar 2 12:45:25 2023
owner: linrwang
uid: 12154
group: cs_external
gid: 14800
sge_o_home: /home/linrwang
sge_o_log_name: linrwang
sge_o_path: /usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/pdsh/bin:/opt/rocks/bin:/opt/rocks/sbin:/opt/gridengine/bin/lx-amd64:/home/linrwang/.local/bin:/home/linrwang/bin
sge_o_shell: /bin/bash
sge_o_workdir: /home/linrwang/infectiondata/ref
sge_o_host: pchuckle
account: sge
cwd: /home/linrwang/infectiondata/ref
merge: y
hard resource_list: h_vmem=16G,tmem=16G
mail_list: linrwang@pchuckle.local
notify: FALSE
job_name: TEMOCK1
jobshare: 0
shell_list: NONE:/bin/bash
env_list: TERM=NONE
script_file: TETranscript1.sh
project: external
binding: NONE
job_type: NONE
usage 1: cpu=00:53:58, mem=11570.91627 GB s, io=29.28684 GB, vmem=4.631G, maxvmem=5.363G
binding 1: NONE
scheduling info: (Collecting of scheduler job information is turned off)
I'm not sure if these help. And I really appreciate your willingness to help reproduce the error! I'm also going to try it in another HPC environment now, and I'll let you know if I solve it!
Many thanks, Linran ;)
Hi Linran,
I don't see any obvious issue from the various logs and output. If you're willing to provide either a sample of (or the entire) BAM file, I'll be happy to try and run it on my setup to see if it's reproducible. About 10,000,000 lines should be sufficient for testing purposes.
Thanks.
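One way to cut a test sample of about 10,000,000 lines from a BAM file is to stream it as SAM text, keep the header plus the first N alignment records, and convert the result back to BAM. The sketch below covers only the middle step; sam_head is a made-up helper name, and the file names in the usage note are placeholders.

```shell
# sam_head: hypothetical helper that keeps every header line (starting with "@")
# plus the first N alignment records from SAM text on stdin.
sam_head() {
  awk -v n="$1" '
    /^@/ { print; next }      # header lines are always kept
    seen >= n { exit }        # stop once N alignment records have been printed
    { print; seen++ }
  '
}
```

A possible invocation is samtools view -h Mock1Aligned.sortedByCoord.out.bam | sam_head 10000000 | samtools view -b -o sample.bam -. Because the input is coordinate-sorted, the sample will cover only the first reference sequences in header order, and for paired-end data some reads near the cut may lose their mates, so this is suitable for reproducing a crash rather than for quantification.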
Hi Oliver,
Sorry for the late reply. I still cannot solve this problem.
Here is the link to the BAM file. I would be really grateful if you could help me reproduce this. https://drive.google.com/file/d/1zZGbX7vB1md5YkJUvEOBks2kKFzxEM0R/view?usp=sharing
Many thanks, Linran
From: Oliver Tam @.> Sent: 07 March 2023 17:46 To: mhammell-laboratory/TEtranscripts @.> Cc: Wang, Linran @.>; Author @.> Subject: Re: [mhammell-laboratory/TEtranscripts] TEcount was killed after receiving the error: "Could not retrieve index file for '.1677762396.6834505.bam" (Issue #134)
Hi Linran,
Thanks for providing the file. I'll take a look at it and get back to you.
Thanks.
Hi Linran,
I was able to run TEcount on your BAM file and the GTF without issue (it ran for 4 hrs). I am attaching the output file here: https://github.com/mhammell-laboratory/TEtranscripts/files/10947346/TEcount_out.cntTable.gz
I wonder if this is something related to the use of Singularity. I did notice a short spike in memory usage of up to 70 GB, although the run used only 6-7 GB for most of its duration.
Is the TETranscript1.sh that was referenced in the log file just the Singularity command?
Thanks.
Hi Oliver,
Thank you! Yes, I think this may be due to the use of Singularity or the cluster.
Here is my TETranscript1.sh:
basedirectory=~/infectiondata/ref
cd $basedirectory
$singularity exec ~/tetranscripts.sif TEcount -b ~/infectiondata/alignout/Mock1Aligned.sortedByCoord.out.bam --GTF ~/infectiondata/ref/Homo_sapiens.GRCh38.109.gtf --TE ~/infectiondata/ref/GRCh38_Ensembl_rmsk_TE.gtf --sortByPos
Many thanks, Linran
Hi Linran,
Could you try one more thing? Could you increase the memory request to 80 GB? I wonder if the sudden spike in memory usage that I saw on my run is what killed yours. If that's not the case, I'm working on a version that suppresses the pysam error message (which might have triggered something in Singularity or your cluster), to see if that resolves the issue.
Thanks.
Hi Oliver,
Thanks for your advice. After I requested 80G of memory it ran for 1 hour, so I thought it had succeeded (before this, it was usually killed within 10 minutes). However, it exited with the same error. Increasing the memory does extend the run time, so I can try again with more memory. By the way, how long did your run take?
INFO @ Mon, 13 Mar 2023 15:28:38:
INFO @ Mon, 13 Mar 2023 15:28:38: Processing GTF files ...
INFO @ Mon, 13 Mar 2023 15:28:38: Building gene index .......
100000 GTF lines processed.
200000 GTF lines processed.
300000 GTF lines processed.
400000 GTF lines processed.
500000 GTF lines processed.
600000 GTF lines processed.
700000 GTF lines processed.
800000 GTF lines processed.
900000 GTF lines processed.
1000000 GTF lines processed.
1100000 GTF lines processed.
1200000 GTF lines processed.
1300000 GTF lines processed.
1400000 GTF lines processed.
1500000 GTF lines processed.
1600000 GTF lines processed.
INFO @ Mon, 13 Mar 2023 15:43:59: Done building gene index ......
INFO @ Mon, 13 Mar 2023 15:44:12: Building TE index .......
INFO @ Mon, 13 Mar 2023 15:48:40: Done building TE index ......
INFO @ Mon, 13 Mar 2023 15:48:40: Reading sample file ...
job_number: 9842579
exec_file: job_scripts/9842579
submission_time: Mon Mar 13 15:25:08 2023
owner: linrwang
uid: 12154
group: cs_external
gid: 14800
sge_o_home: /home/linrwang
sge_o_log_name: linrwang
sge_o_path: /home/linrwang/miniconda3/bin:/home/linrwang/miniconda3/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/pdsh/bin:/opt/rocks/bin:/opt/rocks/sbin:/opt/gridengine/bin/lx-amd64:/home/linrwang/.local/bin:/home/linrwang/bin
sge_o_shell: /bin/bash
sge_o_workdir: /home/linrwang/infectiondata/ref
sge_o_host: pchuckle
account: sge
cwd: /home/linrwang/infectiondata/ref
merge: y
hard resource_list: h_vmem=80G,tmem=80G
mail_list: @.***
notify: FALSE
job_name: TEMOCK2
jobshare: 0
shell_list: NONE:/bin/bash
env_list: TERM=NONE
script_file: TETranscript1.sh
project: external
binding: NONE
job_type: NONE
usage 1: cpu=01:00:25, mem=13351.09209 GB s, io=29.59654 GB, vmem=4.658G, maxvmem=5.363G
binding 1: NONE
scheduling info: (Collecting of scheduler job information is turned off)
Many thanks, Linran
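To compare what was requested with what the scheduler recorded, the two relevant fields can be pulled out of job reports like the one above. The sketch below is only an illustration: peak_vmem is a made-up helper name, and it assumes the report text contains an h_vmem= entry in the resource list and a maxvmem= entry in the usage line, as in the output pasted in this thread.

```shell
# peak_vmem: hypothetical helper that extracts h_vmem (requested limit) and
# maxvmem (peak recorded by the scheduler) from job report text on stdin.
peak_vmem() {
  awk '
    match($0, /h_vmem=[^, ]+/)  { req  = substr($0, RSTART + 7, RLENGTH - 7) }
    match($0, /maxvmem=[^, ]+/) { peak = substr($0, RSTART + 8, RLENGTH - 8) }
    END { printf "requested=%s peak=%s\n", req, peak }
  '
}
```

Fed the report for job 9842579, this gives requested=80G peak=5.363G, the same peak as in the earlier 16G runs. A recorded peak that far below the limit suggests the job was not simply stopped for exceeding h_vmem, although a scheduler that samples usage periodically may miss a very short spike.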
Hi Linran,
Your previous file ran for 4 hrs before completion.
It might be worth trying with more memory. I'm surprised that this is an issue, especially since your cluster indicates that it only used around 5 GB of memory (my output says maxvmem=73G).
Sorry that I can't be more helpful. However, it does appear that the pysam error message is not the cause.
Thanks.
Hi Oliver,
I'm happy to tell you that I finally solved this problem! I think it was due to my cluster. I ran it successfully in a new cluster environment that I recently got access to. It did not need much memory: I got the output after running it for 2 hours with 20G. I'm very grateful for your help and I wish you the best.
All the best, Linran
Hi Linran,
I'm glad to hear that the problem is solved. Please let us know if you encounter other issues.
Thanks.
Hello! I encountered the following error when running TEcount: [E::idx_find_and_load] Could not retrieve index file for '.1677762396.6834505.bam'
I have seen that others encounter the same message without it affecting the output. But my script stops running here, and I get no other output. I'm running this in an HPC environment, and I think I've requested enough memory and time, so that shouldn't be the cause. I'm still new to bioinformatics and Linux, so if anyone can help I'd appreciate it! Thank you!
My scripts:
My output: