MEGA-GO / MegaGO

Calculate semantic distance for sets of Gene Ontology terms
MIT License

Obtain output in matrix like text file #49

Open LucvZon opened 2 years ago

LucvZon commented 2 years ago

When running megago with multiple files, the semantic similarity output is printed to the terminal in an odd format. One entry is printed for each sample comparison with its score, and the sample names are always replaced by "sample 0 and 1", "sample 1 and 2", etc.

I see that the heatmap option helps with interpretation by visualizing the scores and displaying the original file names as sample names. However, I would like to obtain the semantic similarity scores as a matrix-like text file, which would essentially be the same data the heatmap is based on. Currently I don't see any option to output the similarity scores in such a format, and parsing the terminal output seems impractical, especially with the sample names being changed in the output.

Is there any way to get such a matrix-like text file output of the scores?

pverscha commented 2 years ago

Hi,

Since users are also allowed to pass sets of GO-terms directly instead of normal files, we cannot extract file names to use as row and column names in all cases. That's why the sets in the output are renamed to sample 0, sample 1, etc. The numbering of the samples should follow the order in which you provided them, so you should be able to parse the output and map the names back to the originals.
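A minimal sketch of that parsing and renaming step, in Python. The exact shape of the printed lines is an assumption here (something like `sample 0 and 1: 0.75`); the regular expression and the helper names `build_matrix` and `to_tsv` are hypothetical and not part of MegaGO, so the pattern will likely need adjusting to the real output. It also assumes the score is symmetric, and it leaves any pair that was not printed as `NA`.

```python
import re

# ASSUMPTION: each comparison appears on one line shaped roughly like
# "sample 0 and 1: 0.75". Adjust the pattern to the real terminal output.
PAIR_RE = re.compile(r"sample (\d+) and (\d+)\D+(\d+(?:\.\d+)?)")


def build_matrix(lines, names):
    """Fill an n x n matrix from the printed pairwise scores.

    `names` must be in the same order as the inputs were passed on the
    command line, so index i corresponds to "sample i".
    """
    n = len(names)
    matrix = [[None] * n for _ in range(n)]
    for line in lines:
        match = PAIR_RE.search(line)
        if match is None:
            continue  # skip anything that is not a comparison line
        i, j = int(match.group(1)), int(match.group(2))
        score = float(match.group(3))
        # ASSUMPTION: the similarity is symmetric, so mirror the value.
        matrix[i][j] = matrix[j][i] = score
    return matrix


def to_tsv(matrix, names):
    """Render the matrix as tab-separated text with the original names."""
    rows = ["\t".join([""] + list(names))]
    for name, row in zip(names, matrix):
        cells = ["NA" if value is None else str(value) for value in row]
        rows.append("\t".join([name] + cells))
    return "\n".join(rows)
```

To use it, capture the terminal output in a file, pass its lines together with the input file names in the order they were given, and write the result of `to_tsv` to disk.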

If only files are provided (and no in-place sets of GO-terms), we could possibly add the sample names to the output file. I will take this issue into account and add it to the backlog of things that we would like to add to the MegaGO package. Thank you for your feedback! Feel free to reach out to us in case you have more issues.