Open katsikora opened 7 years ago
Hi,
Can you print the values in "mag.filt.net"? It seems the issue is most likely at the level of clustering contigs (done using RapClust).
Regards, Laraib
Hello Laraib,
awk '{ print $3 }' mag.filt.net | sort | uniq
1.1
It's the same as in grassGraph.txt, then.
Thanks for looking into it,
Best,
Katarzyna
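For anyone repeating this check, here is a minimal Python sketch of the same tally as the awk pipeline, with per-value counts added. It assumes the network file has three whitespace-separated columns with the weight in the third, as the awk command implies. The function name `tally_weights` and the sample rows are made up for illustration.

```python
from collections import Counter

def tally_weights(lines):
    """Count how often each value occurs in the third whitespace-separated column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:          # skip blank or malformed rows
            counts[fields[2]] += 1
    return counts

# Hypothetical rows standing in for the real file contents.
sample = [
    "a a 1.1",
    "b b 1.1",
    "a b 0.5",
]
weight_counts = tally_weights(sample)
print(weight_counts)

# To run it on a real file (path is an assumption):
# with open("mag.filt.net") as fh:
#     print(tally_weights(fh))
```

A single key in the result would correspond to the lone `1.1` reported by `uniq`.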
Since RapClust directly uses the results obtained from Salmon/Sailfish, I'm guessing the issue is caused by a problem at the read-mapping level. Going over the equivalence classes dumped by Salmon/Sailfish should help figure that out.
Let us know if you need help with that.
-Laraib
Hi,
so I looked into the equivalence classes of 1 of the multiple datasets I want to analyze.
If I get this right, then all the classes in that dataset have only 1 member transcript:
awk '( NR>120263 )' eq_classes.txt | awk '{ print $1 }' | sort | uniq
1
I ran Salmon (v. 0.7.2) as:
salmon quant -i $REFDIR/161221salmon.quasi -l ISR -1 <(zcat ${R1fq[$i]} ) -2 <(zcat ${R2fq[$i]} ) -o $runDir -p 8 --maxOcc 20 --maxReadOcc 1 --coverage 0.90 --dumpEq
Do you think any of the parameters might cause this behaviour?
Best,
Katarzyna
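A rough Python sketch of the same inspection, for readers who want a full histogram of class sizes rather than only the distinct values. It assumes the eq_classes.txt layout implied by the awk command: a first line with the number of transcripts, a second line with the number of classes, one transcript name per line, then one line per class whose first field is the number of member transcripts. The function name and the toy input are hypothetical.

```python
from collections import Counter

def class_size_histogram(lines):
    """Return a Counter mapping 'transcripts per class' to 'number of classes'.

    Assumed layout: line 1 = number of transcripts, line 2 = number of classes,
    then the transcript names, then one line per equivalence class starting
    with the class size.
    """
    it = iter(lines)
    n_transcripts = int(next(it))
    n_classes = int(next(it))
    for _ in range(n_transcripts):    # skip the transcript name block
        next(it)
    sizes = Counter()
    for _ in range(n_classes):
        fields = next(it).split()
        sizes[int(fields[0])] += 1
    return sizes

# Toy input: 3 transcripts, 2 classes (one single-member, one two-member).
sample = ["3", "2", "t1", "t2", "t3", "1 0 10", "2 1 2 5"]
hist = class_size_histogram(sample)
print(hist)

# On a real file (path is an assumption):
# with open("eq_classes.txt") as fh:
#     print(class_size_histogram(fh))
```

If every class has a single member, the histogram contains only the key 1, which matches the output reported here.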
Hi Katarzyna,
Can you tell us what the mapping rate of Salmon is? In the quantification directory of one of your samples, there is a directory called aux_info, and within this directory, a file called meta_info.json. Can you post the contents of that file? That will help us determine further what might be going on.
Best, Rob
Hi Rob,
here goes the file content:
{
  "salmon_version": "0.7.2",
  "samp_type": "none",
  "num_libraries": 1,
  "library_types": [ "ISR" ],
  "frag_dist_length": 1001,
  "seq_bias_correct": false,
  "gc_bias_correct": false,
  "num_bias_bins": 4096,
  "mapping_type": "mapping",
  "num_targets": 120261,
  "num_bootstraps": 0,
  "num_processed": 103236790,
  "num_mapped": 42745806,
  "percent_mapped": 41.405593877918903,
  "call": "quant",
  "start_time": "Tue Jan 10 10:51:56 2017"
}
Thanks for your help,
Best,
Katarzyna
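As a quick cross-check of the mapping rate, here is a small Python sketch that recomputes the percentage from num_mapped and num_processed. Only the three relevant fields from the posted file are embedded; the helper name `mapping_rate` is hypothetical, and in practice the JSON would be read from aux_info/meta_info.json.

```python
import json

def mapping_rate(meta):
    """Percentage of processed fragments that were mapped."""
    return 100.0 * meta["num_mapped"] / meta["num_processed"]

# Fields copied from the posted meta_info.json.
meta_text = """
{
  "num_processed": 103236790,
  "num_mapped": 42745806,
  "percent_mapped": 41.405593877918903
}
"""
meta = json.loads(meta_text)
rate = mapping_rate(meta)
print(f"{rate:.2f}% of fragments mapped")

# Reading from disk instead (path is an assumption):
# with open("aux_info/meta_info.json") as fh:
#     print(mapping_rate(json.load(fh)))
```

The recomputed value agrees with the reported percent_mapped, so roughly 41% of the fragments mapped in this run.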
Dear Laraib,
I have rerun GRASS on Salmon equivalence classes obtained with near-default parameters. I now get multiple transcripts per class, and the grassGraph.txt file shows a distribution of edge values.
head grassGraph.txt
131_Gills_comp118255_c0_seq1 131_Gills_comp118255_c0_seq1 1.1
145_Blood_comp87177_c6_seq6 145_Blood_comp87177_c6_seq6 1.1
138_Blood_comp131799_c1_seq1 138_Blood_comp131799_c1_seq1 1.1
131_Blood_comp119291_c0_seq1 136_Gills_comp98921_c0_seq1 0.0510510510511
145_Typhlosole_comp85224_c1_seq2 145_Typhlosole_comp85224_c1_seq2 1.1
133_Typhlosole_comp103582_c0_seq1 133_Typhlosole_comp103582_c0_seq1 1.1
131_Blood_comp135226_c0_seq1 131_Blood_comp135226_c0_seq1 1.1
131_Gills_comp165313_c0_seq1 131_Gills_comp165313_c0_seq1 1.1
130_Kidney_comp52583_c0_seq1 136_Blood_comp120067_c1_seq2 0.838095238095
140_Kidney_comp68393_c4_seq2 142_Kidney_comp67810_c0_seq1 0.97558685446
Still, I get the same error from junto as before:
Exception in thread "main" java.lang.RuntimeException: Non-positive weighted edge:>>136_Blood_comp125844_c2_seq5-->136_Blood_comp125844_c2_seq5<< -1.1
    at upenn.junto.algorithm.Adsorption$$anonfun$run$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$1.apply(Adsorption.scala:179)
    at upenn.junto.algorithm.Adsorption$$anonfun$run$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$1.apply(Adsorption.scala:169)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at upenn.junto.algorithm.Adsorption$$anonfun$run$1$$anonfun$apply$mcVI$sp$1.apply(Adsorption.scala:169)
    at upenn.junto.algorithm.Adsorption$$anonfun$run$1$$anonfun$apply$mcVI$sp$1.apply(Adsorption.scala:162)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at upenn.junto.algorithm.Adsorption$$anonfun$run$1.apply$mcVI$sp(Adsorption.scala:162)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at upenn.junto.algorithm.Adsorption.run(Adsorption.scala:155)
    at upenn.junto.app.JuntoRunner$.apply(Junto.scala:72)
    at upenn.junto.app.JuntoConfigRunner$.apply(Junto.scala:121)
    at upenn.junto.app.JuntoConfigRunner$.main(Junto.scala:132)
    at upenn.junto.app.JuntoConfigRunner.main(Junto.scala)
Is there something in the treatment of self-directed edges that may be causing this?
Best regards,
Katarzyna
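One way to narrow this down is to scan the graph file for self-directed edges and for weights that are zero or negative before handing it to junto. The Python sketch below assumes the three-column "source target weight" layout visible in the head output; the function name `suspicious_edges` and the node names in the sample are hypothetical, and this is only a diagnostic aid, not part of GRASS or junto.

```python
def suspicious_edges(lines):
    """Yield (line_number, source, target, weight, reasons) for edges that are
    self-directed and/or carry a non-positive weight."""
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != 3:          # ignore blank or malformed rows
            continue
        source, target, weight_text = fields
        weight = float(weight_text)
        reasons = []
        if source == target:
            reasons.append("self-loop")
        if weight <= 0:
            reasons.append("non-positive weight")
        if reasons:
            yield lineno, source, target, weight, reasons

# Hypothetical edges standing in for grassGraph.txt.
sample = [
    "n1 n1 1.1",
    "n1 n2 0.5",
    "n3 n3 -1.1",
    "n2 n4 0.0",
]
flagged = list(suspicious_edges(sample))
for row in flagged:
    print(row)

# On the real file (path is an assumption):
# with open("grassGraph.txt") as fh:
#     for row in suspicious_edges(fh):
#         print(row)
```

If the file itself contains no negative weights, as the head output suggests, the -1.1 in the exception would have to arise somewhere after the file is read.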
Hi,
I am not sure why you are getting this error. We don't treat self-directed edges any differently yet. Can you share your GRASS graph file (grassGraph.txt), the seed file (seed.txt) and the junto config file (junto.config)? I will try to reproduce the error and see what's going on.
Also, how did you install junto? Did you git clone and compile it?
Thanks, Laraib
Hi Laraib,
grassGraph.txt is attached and junto.config is pasted below, but seed.txt is empty. Let me know if you can make any sense out of this,
Best,
Katarzyna
cat junto.config
seed_file = /data/processing3/sikora/sikora/Trinity.grass/seed.txt
output_file = /data/processing3/sikora/sikora/Trinity.grass/tempOutput
graph_file = /data/processing3/sikora/sikora/Trinity.grass/grassGraph.txt
data_format = edge_factored
iters = 1
prune_threshold = 0
algo = adsorption
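Since seed.txt turned out to be empty, a small pre-flight check on the input files named in the config may be useful. The Python sketch below parses the "key = value" lines shown in junto.config and reports which of the referenced input files are missing or have zero size. The helper names are hypothetical, the demo uses temporary files rather than the real paths, and whether an empty seed file is related to the exception is not established in this thread.

```python
import os
import tempfile

def parse_config(text):
    """Parse simple 'key = value' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

def empty_or_missing(settings, keys=("seed_file", "graph_file")):
    """Return {key: 'missing' | 'empty'} for input files that look unusable."""
    problems = {}
    for key in keys:
        path = settings.get(key)
        if path is None or not os.path.exists(path):
            problems[key] = "missing"
        elif os.path.getsize(path) == 0:
            problems[key] = "empty"
    return problems

# Demo with temporary stand-ins: an empty seed file and a one-edge graph file.
with tempfile.TemporaryDirectory() as tmp:
    seed = os.path.join(tmp, "seed.txt")
    graph = os.path.join(tmp, "grassGraph.txt")
    open(seed, "w").close()
    with open(graph, "w") as fh:
        fh.write("n1 n2 0.5\n")
    config_text = f"seed_file = {seed}\ngraph_file = {graph}\nalgo = adsorption\n"
    problems = empty_or_missing(parse_config(config_text))
    print(problems)
```

Run against the real junto.config, this kind of check would flag seed_file as empty, matching what was reported here.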
Dear developers,
I encounter the following error when running GRASS:
Exception in thread "main" java.lang.RuntimeException: Non-positive weighted edge:>>136_Blood_comp125844_c2_seq5-->136_Blood_comp125844_c2_seq5<< -1.1
As far as I can tell this happens after running blast, when calling junto. That problematic edge is self-directed (same source and target sequence). The contents of my run directory after program exit are:
total 102M
drwxr-xr-x 1 sikora bioinfo 36 Jan 11 12:56 AS
-rw-r----- 1 sikora bioinfo 73M Jan 11 13:01 AS.TSdb
drwxr-xr-x 1 sikora bioinfo 36 Jan 11 12:56 TS
-rw-r----- 1 sikora bioinfo 0 Jan 10 15:23 TS.ASdb
-rw-r----- 1 sikora bioinfo 7.1M Jan 11 13:01 grassGraph.txt
-rw-r----- 1 sikora bioinfo 288 Jan 11 12:54 junto.config
-rw-r----- 1 sikora bioinfo 3.3M Jan 11 12:56 mag.clust
-rw-r----- 1 sikora bioinfo 7.1M Jan 11 12:56 mag.filt.net
-rw-r----- 1 sikora bioinfo 4.8M Jan 11 12:56 mag.flat.clust
-rw-r----- 1 sikora bioinfo 7.1M Jan 11 12:55 mag.net
-rw-r----- 1 sikora bioinfo 0 Jan 10 15:28 seed.txt
-rw-r----- 1 sikora bioinfo 0 Jan 10 15:28 seedLabels.txt
-rw-r----- 1 sikora bioinfo 88 Jan 11 12:56 stats.json
The TS.ASdb file is empty. Curiously, all values in grassGraph.txt are equal to 1.1:
awk '{ print $3 }' grassGraph.txt | sort | uniq
1.1
Would you have any idea what might be the problem?
Best regards,
Katarzyna