blahah / assemblotron-paper

Paper for Assemblotron
MIT License

Choose a read dataset for the paper #7

Open blahah opened 9 years ago

blahah commented 9 years ago

Features of the dataset:

blahah commented 9 years ago

This yeast dataset, generated for the Trinity paper, is an option:

Advantages of this dataset include:

blahah commented 9 years ago

As a preliminary analysis, I downloaded the yeast dataset and ran assemblotron on it without subsampling. A single assembly with SOAPdenovo-Trans + transrate takes about 4 minutes on 24 cores of our cluster, so a sweep over perhaps 4 major parameters (K, d, e, t) looks feasible. Unfortunately, some combination of the cluster environment and our build of Salmon is making Salmon segfault on some data. I am working through this with Rob Patro (the Salmon developer).
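
To get a rough feel for what a full sweep would cost at about 4 minutes per assembly, here is a minimal sketch that enumerates a grid over the four parameters and estimates the serial runtime. The value ranges are hypothetical placeholders chosen only for illustration, not the ranges used in the actual sweep, and the helper names (`sweep_points`, `estimated_hours`) are made up for this example.

```python
from itertools import product

# Timing from the comment above: one assembly + transrate run on 24 cores.
MINUTES_PER_ASSEMBLY = 4

# Hypothetical value ranges for each parameter (assumptions, for illustration only).
GRID = {
    "K": range(21, 82, 10),  # 21, 31, ..., 81
    "d": range(0, 5),
    "e": range(1, 6),
    "t": range(1, 6),
}


def sweep_points(grid):
    """Return every parameter combination in the grid as a dict."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*grid.values())]


def estimated_hours(n_assemblies, minutes_each=MINUTES_PER_ASSEMBLY):
    """Estimate total wall-clock hours if assemblies run one after another."""
    return n_assemblies * minutes_each / 60


points = sweep_points(GRID)
print(f"{len(points)} assemblies, ~{estimated_hours(len(points)):.1f} hours serially")
```

With these placeholder ranges the grid has 875 combinations, which is roughly 58 hours if run one at a time, so an exhaustive sweep on the full dataset is plausible but not cheap.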

blahah commented 9 years ago

Turns out there was a bug in Salmon that was triggered by this dataset. It's now fixed in Salmon 0.4.2 and transrate 1.0.0, so I've set the parameter sweep running again.

blahah commented 9 years ago

Possible Arabidopsis read datasets: