Hi. This looks like an issue with the updated pysam (>0.9) and samtools (>1.3), where the old `samtools sort` command no longer works. We are in the process of adding the ability to handle a newer pysam and samtools, and should hopefully get this into the code soon.
Thanks, and apologies for the slow responses.
Hi, I've got the same problem. I was wondering if you had found a workaround since then? Thank you in advance.
Hi, there is a workaround: use samtools to sort the BAM files by read name before running TEtranscripts. Thanks.
It worked. Thanks
@retrogenomics What do you mean it worked? I'm getting the same error. Could you please tell me how you fixed it? Thank you very much in advance.
@vasilislenis It worked as suggested by @yingjin07, i.e. by using samtools to sort the bam files according to read names rather than by coordinates (`samtools sort -@4 -O BAM -n file.bam -o file.sortedByReadname.bam`). Then you can run TEtranscripts without getting the error.
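For anyone scripting this step over several samples, here is a minimal Python sketch of the same name sort. It only wraps the command quoted above; the function names `name_sort_command` and `name_sort` are hypothetical, and it assumes a samtools release that accepts the newer `sort` syntax.

```python
import shutil
import subprocess
from pathlib import Path


def name_sort_command(bam_path, threads=4):
    """Build the samtools command that sorts a BAM file by read name."""
    bam = Path(bam_path)
    # Output name mirrors the one used in the thread: file.sortedByReadname.bam
    out = bam.with_name(bam.stem + ".sortedByReadname.bam")
    return ["samtools", "sort", "-@", str(threads), "-O", "BAM", "-n",
            "-o", str(out), str(bam)]


def name_sort(bam_path, threads=4):
    """Run the sort and return the output path (hypothetical helper)."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools was not found on PATH")
    cmd = name_sort_command(bam_path, threads)
    # check=True raises CalledProcessError if samtools exits with an error
    subprocess.run(cmd, check=True)
    return cmd[cmd.index("-o") + 1]
```

Calling `name_sort("file.bam")` would write `file.sortedByReadname.bam` next to the input, which can then be passed to TEtranscripts.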
@retrogenomics Thank you very much for your reply but unfortunately, it didn't work. I'm still getting the same error. I am using the SE testing files.
@vasilislenis. What is your `TEtranscripts` command line? Once you use `samtools sort`, you should remove the `--sortByPos` parameter from the `TEtranscripts` command (that parameter calls `samtools sort` again, which, assuming that you are using a newer version of `samtools`, might be causing the issue). Let me know if that does not resolve the issue. Thanks!
@olivertam. Yes, you are right! I hadn't removed the `--sortByPos` parameter. Thank you!
Now I have a new error:
`INFO @ Thu, 18 Jan 2018 16:52:15: Finished processing sample files
INFO @ Thu, 18 Jan 2018 16:52:15: Generating counts table
CRITICAL @ Thu, 18 Jan 2018 16:52:25: Error in running differential analysis!
CRITICAL @ Thu, 18 Jan 2018 16:52:25: Error: [Errno 2] No such file or directory
CRITICAL @ Thu, 18 Jan 2018 16:52:25: [Exception type: OSError, raised in subprocess.py:1249]`
At least it generated the counts table, which is all that I need. I was expecting it to generate two different counts tables (one for coding and one for non-coding), but I found one with all the counts. Is that OK?
@vasilislenis There should be only one count table (both coding genes and TE in one). You can easily separate them out by searching for the `:` in the name (which should only appear in TE). It is not immediately clear what your new error is (other than a missing file). Are you using the test data, or your own? If you don't mind providing the command line (you can replace any file names with placeholders if you prefer), that would be great. Thanks.
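The split described above can be sketched in a few lines of Python. This assumes the count table is tab-delimited with one header line and the gene/TE name in the first column; the function name `split_count_table` and the example rows are made up for illustration.

```python
def split_count_table(lines):
    """Split count-table rows into (header, gene_rows, te_rows).

    Assumes a tab-delimited table with a single header line and the
    gene/TE name in the first column (an assumption, not checked here).
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    genes, tes = [], []
    for row in body:
        name = row.split("\t", 1)[0]
        # Per the comment above, ":" should only appear in TE names
        (tes if ":" in name else genes).append(row)
    return header, genes, tes


# Hypothetical rows, only to show the shape of the input
example = [
    "gene/TE\tcontrol\ttreatment\n",
    "GeneA\t10\t12\n",
    "TE1:FamilyX:ClassY\t3\t7\n",
]
header, genes, tes = split_count_table(example)
```

With a real table, pass the open file object instead of the `example` list, then write `header` plus each group to its own file.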
@vasilislenis Also, was a `[prefix]_DESeq.R` file generated in addition to the counts table? Thanks.
@olivertam. I am really sorry! I forgot to load the R module. Based on the log file everything went well. It was the test data, so now I will try with mine. I'll let you know how it goes. Thank you very much!
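An `[Errno 2] No such file or directory` raised from `subprocess.py` usually means an external program could not be found. A small pre-flight check like the sketch below can catch that before a long run. The executable names `samtools` and `Rscript` are assumptions about what needs to be on the `PATH`, and `missing_tools` is a hypothetical helper.

```python
import shutil


def missing_tools(tools=("samtools", "Rscript")):
    """Return the names from `tools` that cannot be found on PATH.

    The default names are assumptions; adjust them to your setup.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required executables were found.")
```

On a cluster that uses environment modules, run the check after loading the modules you intend to use.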
I have the same problem: `[Exception type: SamtoolsError, raised in utils.py:75]`. So I sorted the files by name and re-ran TEtranscripts, but got another error:
`If the BAM file is sorted by coordinates, please specify --sortByPos and re-run!`
Hi. If you could paste a copy of the log file for the second run, and perhaps the header and the first 10 or so alignments of the sorted BAM file, I can take a closer look. Thanks.
Thanks!
Here is my file:
Here is the logfile:
Jingyi
Hi Jingyi,
Looking at the BAM file, it is odd that it still looks like it is sorted by co-ordinates. I see `chr1` for the first batch of alignments, and they appear to be arranged in numeric order (see column 2). You can also see that although the first 8 lines appear to be pairs of alignments, line 9 and line 12 appear to be a pair, but are separated by two other alignments.
It might be worth double-checking the headers to see if the files are sorted correctly. The header should have a line that looks like the following:
@HD ... SO:queryname
and not
@HD ... SO:coordinate
Thanks.
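Both checks described above (the `SO:` tag in the `@HD` line, and whether alignments with the same read name sit next to each other) can be sketched in Python on plain SAM text, for example the output of `samtools view -H file.bam` and the first lines of `samtools view file.bam`. The function names are hypothetical, and the header values in the example are illustrative only.

```python
def sort_order(header_text):
    """Return the SO value from the @HD header line, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def reads_grouped_by_name(sam_lines):
    """True if each read name occupies one contiguous run of alignments."""
    seen = set()
    previous = None
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        name = line.split("\t", 1)[0]  # column 1 is the read name
        if name != previous:
            if name in seen:
                return False  # name reappears after other reads
            seen.add(name)
            previous = name
    return True


# Illustrative header text, not taken from a real file
header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000"
if sort_order(header) != "queryname":
    print("BAM does not declare SO:queryname; re-sort with samtools sort -n")
```

The header tag only records what the sorting tool declared, so the adjacency check on the first few hundred alignments is a useful second opinion.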
Thanks so much. It is indeed sorted by coordinate. Let me re-sort them again.
Have a good weekend. Jingyi
Issues relating to the newer versions of samtools/pysam should be addressed in TEToolkit v2.0.1. Note that the pysam requirement is now v0.9.0 or higher.
Hello!
there seems to be a problem with samtools, any idea on what might be the problem?
Cheers!