SystemsGenetics / KINC

Knowledge Independent Network Construction
MIT License

What qualifies as a useable GEM? #120

Closed JohnHadish closed 4 years ago

JohnHadish commented 4 years ago

This is a minor suggestion that I think would prevent unnecessary user confusion.

I am of the opinion that both of the following examples should qualify as valid GEMs (gene expression matrices), and that KINC should automatically accept either:

GEM A

geneID A B C
Gene1.0 1 2 3
Gene2.0 1 2 3
Gene3.0 1 2 3

GEM B

A B C
Gene1.0 1 2 3
Gene2.0 1 2 3
Gene3.0 1 2 3

The only difference between them is that 'GEM A' contains geneID in the first column while 'GEM B' does not. Currently, KINC only considers 'GEM B' a valid option. If 'GEM A' is submitted, an error similar to the following will be thrown:

../../src/core/importexpressionmatrix.cpp:130
virtual void ImportExpressionMatrix::process(const EAbstractAnalytic::Block*)
PARSING ERROR
Encountered gene expression line with incorrect amount of fields. Read in 80 fields when it should have been 81. Gene name is pycom01g00030.

While this is an easy fix for the user (just delete the first word of the header), it is a rather annoying small thing that I think the user should not have to deal with.

I am therefore of the opinion that KINC should check whether the first row has either the same number of entries as the second row or one fewer, and handle each case accordingly. This would spare users unnecessary headaches and make both of the above GEMs valid.
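To make the proposal concrete, here is a minimal Python sketch of the check I have in mind. It is not KINC's actual code (KINC's importer is C++), and the function names `classify_gem_header` and `sample_names` are hypothetical. It assumes whitespace-delimited fields, as in the two examples above.

```python
def classify_gem_header(header_line, first_data_line):
    """Decide which of the two header layouts a GEM uses.

    Hypothetical helper: compares the number of fields in the header
    with the number of fields in the first data row.
    """
    header = header_line.split()   # split on any run of whitespace
    row = first_data_line.split()

    if len(header) == len(row):
        # Header has a label over the gene column (the 'GEM A' layout).
        return "with_row_id"
    if len(header) == len(row) - 1:
        # Header lists sample names only (the 'GEM B' layout).
        return "samples_only"
    raise ValueError(
        f"Header has {len(header)} fields but first data row has {len(row)}."
    )


def sample_names(header_line, first_data_line):
    """Return the sample names, dropping the gene-column label if present."""
    header = header_line.split()
    if classify_gem_header(header_line, first_data_line) == "with_row_id":
        return header[1:]
    return header


if __name__ == "__main__":
    # Both layouts from this issue yield the same sample names.
    print(sample_names("geneID A B C", "Gene1.0 1 2 3"))
    print(sample_names("A B C", "Gene1.0 1 2 3"))
```

With a check like this, either layout could be read without the user editing the file, and a header that fits neither case would still be reported as an error.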

As I said before, this is a minor thing, but I think it should be addressed for a better user experience.

SUPPLEMENTAL: The annotation matrix requires the word "Sample" in row 1, column 1. This means that its format is the opposite of the GEM's format.

bentsherman commented 4 years ago

I raised this same issue in #68. I closed it because I was under the impression that GEMs without the RowID entry now work in both Python and R. However, if R still generates the RowID, then I can add a checkbox in the import-emx analytic to tell KINC to ignore the RowID.

bentsherman commented 4 years ago

This was a really easy change so I just went ahead and added it.