hallamlab / metapathways2

MetaPathways v2.0: A master-worker model for environmental Pathway/Genome Database construction on grids and clouds
http://hallam.microbiology.ubc.ca/MetaPathways/

Current RPKM script has major bugs #89

Closed: cmorganl closed this issue 7 years ago

cmorganl commented 8 years ago

Hey,

The rpkm files in executables/source/rpkm.tar from commit https://github.com/hallamlab/metapathways2/commit/a7a826ce4437ac78af1f58162ecbf0342193ac3a are completely unusable. Instead of the actual ORF names (as provided in the GFF file), each reported "ORF" name is the contig name with _(contig number - 1) appended to it. For example, say you have a contig named sampleG_13, and the GFF file attributes 3 ORFs to that contig: sampleG_13_1, sampleG_13_2 and sampleG_13_3. The output would only contain an RPKM value for sampleG_13_12 (which doesn't exist).
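To make the symptom concrete, here is a minimal Python sketch. It is not the rpkm source (that lives in rpkm.tar); `buggy_orf_name` and `unknown_orfs` are hypothetical helpers that only model the naming pattern described above and show one way to flag RPKM entries that have no matching ORF in the GFF file. It assumes the contig name ends in `_<number>` and that the ORF names have already been extracted from both files.

```python
def buggy_orf_name(contig_name):
    """Model the reported symptom: contig name plus '_(contig number - 1)'.

    Hypothetical helper; assumes the contig name ends in '_<integer>'.
    """
    _, number = contig_name.rsplit("_", 1)
    return f"{contig_name}_{int(number) - 1}"


def unknown_orfs(gff_orf_names, rpkm_orf_names):
    """Return, sorted, the names in the RPKM output that the GFF never defines."""
    return sorted(set(rpkm_orf_names) - set(gff_orf_names))


# ORFs attributed to contig sampleG_13 in the example above.
gff_orfs = ["sampleG_13_1", "sampleG_13_2", "sampleG_13_3"]

# What the buggy output would report for that contig.
reported = [buggy_orf_name("sampleG_13")]

print(reported)                          # ['sampleG_13_12']
print(unknown_orfs(gff_orfs, reported))  # ['sampleG_13_12']
```

A non-empty result from a check like `unknown_orfs` on real output would be a quick sign that the installed RPKM build is affected.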

The version in commit https://github.com/hallamlab/metapathways2/commit/5b9b9bf2e216bf9cc21eb7e585a6bb2be9d89a4d does seem to be working fine though, apart from the buffer-overflow, memory-leak and command-line interface issues addressed in the following commit.

cmorganl commented 8 years ago

I should mention that metapathways2 is currently using the buggy version of RPKM -- this isn't a bug from a random commit.

hallamlab commented 7 years ago

This has been fixed!