Open manuelsmendoza opened 4 years ago
Greetings,
I am also having a very similar problem. With the most recent versions of transrate, the results always show the same set of issues: the number and percentage of mapped fragments are always low (<45%), there are never any potential bridges (potential_bridges = 0), and the uncovered and low-covered values for both bases and contigs are always very high (the corresponding proportions are 1, i.e. 100%). This is similar to issue #243 as well.
Running the exact same samples and assembly through an older version of transrate does not reproduce these strange results: the mapping rate is much higher (p_fragments_mapped >80%) and closer to what we expect.
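For anyone who wants to check their own runs for the same pattern, here is a minimal sketch in Python. It assumes the summary CSV has columns named `assembly`, `p_fragments_mapped`, `potential_bridges`, `p_contigs_uncovered` and `p_contigs_lowcovered`; those names, the function name `flag_suspicious_rows` and the 0.45 threshold are assumptions taken from the symptoms described above, so adjust them to whatever your output file actually contains.

```python
import csv
import io


def flag_suspicious_rows(csv_text, min_mapped=0.45):
    """Return the assemblies whose metrics match the symptom pattern.

    The column names are assumptions about the summary CSV layout;
    rename them if your file uses different headers.
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        mapped = float(row["p_fragments_mapped"])
        bridges = int(float(row["potential_bridges"]))
        uncovered = float(row["p_contigs_uncovered"])
        lowcovered = float(row["p_contigs_lowcovered"])
        # Symptom: low mapping rate, no bridges, every contig uncovered/low-covered.
        if (mapped < min_mapped and bridges == 0
                and uncovered >= 1.0 and lowcovered >= 1.0):
            flagged.append(row["assembly"])
    return flagged
```

To use it on a real file, read the CSV into a string first, e.g. `flag_suspicious_rows(open(path).read())`, where `path` points to your own summary file.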
I've tried multiple approaches, using transrate from the original repository, @abshah's fork (https://github.com/abshah/transrate), @dfmoralesb's fork (https://github.com/dfmoralesb/transrate) and even the conda repackage from @lmfaber (https://github.com/lmfaber/transrate_conda). Even when I'm able to avoid dependency problems like #240, the results still come back with these errors. I've attached a comparison between the results I get from different versions of the program.
A solution would be very much appreciated, since having to transfer the fq/fq.gz files between machines just to run transrate is troublesome and time-consuming.
Best regards, Pedro
TrateV101vsTrateV103-results.xlsx
@manuelsmendoza and @pmomadeira if you are willing to share the input data privately I can investigate what's happening.
Transrate doesn't consider PacBio reads, so it's not appropriate to evaluate a hybrid PacBio/Illumina assembly using only the Illumina reads - by design your sequencing strategy creates an assembly that includes a lot of information not included in the short reads.
However, it's possible there's a bug or some non-obvious problem happening as well - in which case I'll happily debug.
Greetings @blahah,
My data are regular Illumina reads and assemblies, so I don't think a hybrid PacBio/Illumina assembly is the reason for the issue in my case. I'm using an assembly built with rnaSPAdes from paired-end read data. We have used transrate on this kind of data before in our lab and it worked fine, and in fact it still runs fine with an older installation of transrate v1.0.1.
I've been looking at some of the code in the dfmoralesb transrate fork and in your original version to see if I could solve some of the issues I've been having. I ended up creating a new fork/branch to test some changes and I was able to reach a solution (here: transratev1.0.4.1).
By modifying some files I was able to get transrate to work with salmon v1.7.0 and snap-aligner v2.0.1, while also replacing the deprecated trollop with optimist, and the results are coming out as expected now. The mapping rates were slightly lower than with transrate 1.0.1, but it's only a 3-5% difference, which could simply be a result of differences between versions.
I'm still quite new to this, so I'm not sure if I introduced any potential error or unintended changes to the program, so I would be glad if you could check out my branch.
Best regards, Pedro
I am also getting the same results with transrate 1.0.3: bridges = 0, and p contigs uncovered/lowcovered = 1. @blahah, would it be possible to make the old builds available again? Bintray is no longer in use.
Hi @sericomyxa. I wasn't able to solve the issue using transrate 1.0.3, but I did create a new fork that solved the bridges and mapping issues. Check it out here (https://github.com/pmomadeira/transrate) and see if it solves your issue. Hope it helps!
Hi @blahah and folk!
I'm evaluating different assemblies to build a reference transcriptome. I'm using a hybrid approach, i.e. combining long reads (PacBio) with short reads (Illumina). After running the assembly, I tried to evaluate it and the result is a little weird.
The total number of fragments/reads mapped to the assembly in the read mapping metrics module was very low, only 29%, so I aligned the reads using another tool (bowtie2) and the result was much better (84%)... Is something wrong with the TransRate metrics? Is it normal to obtain 100% low-covered and uncovered contigs? Is it normal not to find any bridges?
Can I trust TransRate to remove all misassembled transcripts and continue the pipeline using only the "good" transcripts?
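The comparison between the two mapping rates can be scripted. The sketch below is only an illustration: it assumes the bowtie2 summary (normally printed to stderr) has been saved as text and contains the usual "NN.NN% overall alignment rate" line, and that the TransRate mapped fraction is typed in by hand. The function names `overall_alignment_rate` and `mapping_gap` are hypothetical helpers, not part of either tool.

```python
import re


def overall_alignment_rate(bowtie2_summary):
    """Extract the overall alignment rate (as a fraction) from bowtie2's summary text."""
    match = re.search(r"([\d.]+)% overall alignment rate", bowtie2_summary)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found in the summary")
    return float(match.group(1)) / 100.0


def mapping_gap(transrate_fraction, bowtie2_summary):
    """Difference between the bowtie2 rate and the TransRate mapped fraction."""
    return overall_alignment_rate(bowtie2_summary) - transrate_fraction
```

With the numbers from this comment, `mapping_gap(0.29, summary_text)` would come out at roughly 0.55 for a summary reporting 84%, which is far larger than I would expect from aligner differences alone.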
This issue is related to #220 and #208
TRANSRATE LOG
BOWTIE2 ALIGNMENT STATS