Open Tommeri opened 3 years ago
it looks like your chromosomes aren't being recognised. What genome build are you using?
What happens when you include the --checkchrom flag?
it looks like your chromosomes aren't being recognised. What genome build are you using?
What happens when you include the --checkchrom flag?
Yes, that was the problem. My chromosome names were weird. I changed my genome build to your suggested GRCh38 one and it works ok. Thank you!
Hi, I have the same issue. The --checkchrom flag took out the extra chromosomes but the error remained. I have tried mapping (STAR) with Gencode and Ensembl GRCh38 FASTA. Also, out of curiosity, what is the rationale for using --outSAMtype BAM Unsorted in the STAR run? Samtools will fail to index that, won't it? Thanks in advance for ideas!
Could you check whether all your libraries have mapped successfully, with reads on each chromosome?
Best, Yang
On Thu, May 13, 2021 at 9:58 AM jaro-slamecka @.***> wrote:
Hi, I have the same issue. The --checkchrom flag took out the extra chromosomes but the error remained. I have tried mapping (STAR) with Gencode and Ensembl GRCh38 FASTA. Also, out of curiosity, what is the rationale for using --outSAMtype BAM Unsorted in the STAR run? Samtools will fail to index that, won't it? Thanks in advance for ideas!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/davidaknowles/leafcutter/issues/176#issuecomment-840618451, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGWTCMHKHUDL6LOU5Z6R6DTNPSH7ANCNFSM4WGHCF2A .
I checked with samtools idxstats and all libraries have reads on each chromosome
Would you be able to post the entire error that you are encountering?
Best, Yang
On Thu, May 13, 2021 at 11:02 AM jaro-slamecka @.***> wrote:
I checked with samtools idxstats and all libraries have reads on each chromosome
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/davidaknowles/leafcutter/issues/176#issuecomment-840659834, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGWTCMXG7CCGOL7IRDGYO3TNPZYTANCNFSM4WGHCF2A .
Parsing...
X:+..20:+..13:+..20:?..Traceback (most recent call last):
File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 534, in <module>
main(options, libl)
File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 14, in main
pool_junc_reads(libl, options)
File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 72, in pool_junc_reads
clu = cluster_intervals(read_ks)[0]
File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 337, in cluster_intervals
current = E[0]
IndexError: list index out of range
Are you running regtools correctly? There shouldn't be 20:? (which means that some reads have unassigned strands).
Best, Yang
On Thu, May 13, 2021 at 11:29 AM jaro-slamecka @.***> wrote:
Parsing... X:+..20:+..13:+..20:?..Traceback (most recent call last): File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 534, in
main(options, libl) File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 14, in main pool_junc_reads(libl, options) File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 72, in pool_junc_reads clu = cluster_intervals(read_ks)[0] File "/home/[username]/leafcutter/clustering/leafcutter_cluster_regtools.py", line 337, in cluster_intervals current = E[0] IndexError: list index out of range — You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/davidaknowles/leafcutter/issues/176#issuecomment-840676435, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGWTCIC3GOWA2H7XDHCGN3TNP45ZANCNFSM4WGHCF2A .
I am using the parameters from the vignette:
regtools junctions extract -a 8 -m 50 -M 500000 $bamfile -o $bamfile.junc
output example:
Program: regtools
Version: 0.5.2
Minimum junction anchor length: 8
Minimum intron length: 50
Maximum intron length: 500000
Alignment: /data/user/FASTQ/sample1-star-c.bam
Output file: /data/user/FASTQ/sample1-star-c.bam.junc
If I recall correctly, regtools may have updated their options, see https://regtools.readthedocs.io/en/latest/commands/junctions-extract/. You could try to re-run using the newer options paying attention to the -s flag.
Best, Yang
On Thu, May 13, 2021 at 3:52 PM jaro-slamecka @.***> wrote:
I am using the parameters from the vignette: regtools junctions extract -a 8 -m 50 -M 500000 $bamfile -o $bamfile.junc
output example:
Program: regtools Version: 0.5.2 Minimum junction anchor length: 8 Minimum intron length: 50 Maximum intron length: 500000 Alignment: /data/user/FASTQ/sample1-star-c.bam Output file: /data/user/FASTQ/sample1-star-c.bam.junc
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/davidaknowles/leafcutter/issues/176#issuecomment-840826885, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGWTCMDCRYNBTQOP55WB6LTNQ3XHANCNFSM4WGHCF2A .
Just looked into the script again and actually, that's what I used, I think it threw an error the first time so I have had the -s flag there all along.
~/regtools/build/regtools junctions extract -a 8 -m 50 -M 500000 -s 2 sample1-star-c.bam -o sample1-star-c.bam.junc
Do you have any recommendation on how to address the reads with unassigned strand, if that's indeed the issue?
Throw them out.
Best, Yang
On Thu, May 13, 2021, 18:40 jaro-slamecka @.***> wrote:
Just looked into the script again and actually, that's what I used, I think it threw an error the first time so I have had the -s flag there all along. ~/regtools/build/regtools junctions extract -a 8 -m 50 -M 500000 -s 2 sample1-star-c.bam -o sample1-star-c.bam.junc Do you have any recommendation on how to address the reads with unassigned strand, if that's indeed the issue?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/davidaknowles/leafcutter/issues/176#issuecomment-840896054, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGWTCM6EUHHZPVRFOKWYILTNRPOFANCNFSM4WGHCF2A .
That took care of it. There were indeed several lines with "?" in column 6 of the junc files:
cat sample1-bam.junc | awk '{print $6}' | sort | uniq
-
?
+
So I deleted those lines:
while read JUNC; do
awk '$6 != "?"' $JUNC > ${JUNC/junc/fixed.junc}
done < <( find ./ -type f -name "*junc" )
Now onto the next steps... Thanks so much for your help! Jaro
I run into the same problems, so I threw out those lines with "?" as suggested by Jaro
for juncfile in `ls 19078-09/*junc`; do
echo $juncfile
awk '$6 != "?"' $juncfile > $juncfile.fixed.junc
echo $juncfile.fixed.junc >> test_juncfiles.txt
done
after this I still have the same problems without '--checkchrom'
Parsing...
GL000205.2:-..GL000218.1:-..15:-..3:+..GL000221.1:+..21:+..14:-..KI270727.1:-..GL000220.1:-..16:+..10:+..KI270707.1:-..Traceback (most recent call last):
File "/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 534, in
but after adding "--checkchrom" it works
python2 /leafcutter/clustering/leafcutter_cluster_regtools.py -j test_juncfiles.txt -m 50 -o 19078_09 -l 500000 --checkchrom
Thanks!
Try using the -chrom flag or removing GL/KI chromosomes.
Best, Yang
On Thu, Jun 10, 2021 at 2:12 PM Ying @.***> wrote:
I run into the same problems, and after I throw out those lines with "?", I still have the same problems
Parsing... GL000205.2:-..GL000218.1:-..15:-..3:+..GL000221.1:+..21:+..14:-..KI270727.1:-..GL000220.1:-..16:+..10:+..KI270707.1:-..Traceback (most recent call last): File "/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 534, in main(options, libl) File "/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 14, in main pool_junc_reads(libl, options) File "/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 72, in pool_junc_reads clu = cluster_intervals(read_ks)[0] File "/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 337, in cluster_intervals current = E[0]
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/davidaknowles/leafcutter/issues/176#issuecomment-858934295, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGWTCJPC5Z3MZFA7FYFAVLTSEFAHANCNFSM4WGHCF2A .
Thanks, Yang! The -checkchrom flag works! And thanks for pointing me to GL/KL, I did have some of those on the first column of my junc files.
Regarding the 'leafcutter_cluster_regtools.py' error, this due to an empty list resulting from requiring 3 reads. It has nothing to do with the chromosomes. Trivial fix provided in either of these issues: Error if the list read_ks (related to chromosome identifier) is empty #183 list index out of range #184
Regarding the 'leafcutter_cluster_regtools.py' error, this due to an empty list resulting from requiring 3 reads. It has nothing to do with the chromosomes. Trivial fix provided in either of these issues: Error if the list read_ks (related to chromosome identifier) is empty #183 list index out of range #184
Thanks for this information! I wonder why does 'checkchrom' solve the problem? Is it because well formated chromosomes less likely to have <3 reads?
Hello everyone.
I am trying to run the script on 14 different junc files, constructed using regtools with no errors (using the option -s 1). the error is: NC_000022.11:+..NT_187689.1:-..NW_003315946.1:-..NC_000014.9:-..NW_009646199.1:-..Traceback (most recent call last): File "/home/be/tomm/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 534, in
main(options, libl)
File "/home/be/tomm/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 14, in main
pool_junc_reads(libl, options)
File "/home/be/tomm/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 72, in pool_junc_reads
clu = cluster_intervals(read_ks)[0]
File "/home/be/tomm/bin/leafcutter/clustering/leafcutter_cluster_regtools.py", line 337, in cluster_intervals
current = E[0]
IndexError: list index out of range
Is there any fix to this problem yet? Thank you.