wikipathways / wikipathways.org

The main web site for the WikiPathways project.
http://wikipathways.org
GNU General Public License v2.0

gmt download directory is missing several species #70

Closed khanspers closed 6 years ago

khanspers commented 6 years ago

The gmt folder of downloads only contains gmt files for 17 of the 25 species in the Analysis Collection. Missing species:

Solanum lycopersicum, Hordeum vulgare, Zea mays, Gibberella zeae, Bacillus subtilis, E. coli, Mycobacterium tuberculosis, Plasmodium falciparum

khanspers commented 6 years ago

Another related issue (renamed this ticket): the gmt files for some species don't match the contents of the corresponding gpml dir. For example, the At gmt file lists 18 pathways, but the gpml dir has 26 pathways. Also observed in other species (Hs, Os, Ce...).
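
A quick way to reproduce this kind of count comparison is sketched below. It assumes a gmt file holds one pathway per non-empty line and that the gpml dir holds one `.gpml` file per pathway. The function names and any paths passed to them are hypothetical, not part of the WikiPathways tooling.

```python
from pathlib import Path


def count_gmt_pathways(gmt_path):
    """Count pathways in a gmt file, assuming one pathway per non-empty line."""
    with open(gmt_path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def count_gpml_files(gpml_dir):
    """Count .gpml files directly inside a directory (assumed one per pathway)."""
    return sum(1 for _ in Path(gpml_dir).glob("*.gpml"))


def count_difference(gmt_path, gpml_dir):
    """Return how many gpml pathways have no corresponding line in the gmt file."""
    return count_gpml_files(gpml_dir) - count_gmt_pathways(gmt_path)
```

A positive result from `count_difference` means the gpml dir holds more pathways than the gmt file lists.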

mkutmon commented 6 years ago

I'll check when I am back at work, but the gmt file only includes pathways that have at least one identifier that can be mapped to Entrez Gene. If a pathway doesn't have any identifier that maps to Entrez Gene, the pathway is not added. That is probably the reason why the numbers don't match. For species whose gene identifiers are not available in Entrez Gene, no gmt file is created.
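
The filtering rule described above can be illustrated with a minimal sketch. This is not the actual WikiPathways export code: the function name, the input shapes (a dict of pathway ID to identifiers, and a dict of identifier to Entrez Gene ID), and the `"na"` description column are all assumptions made for illustration.

```python
def build_gmt_lines(pathways, entrez_map):
    """Build tab-separated gmt lines, skipping pathways with no mappable gene.

    pathways:   dict of pathway ID -> list of identifiers (hypothetical input shape)
    entrez_map: dict of identifier -> Entrez Gene ID (hypothetical input shape)
    """
    lines = []
    for pathway_id, identifiers in pathways.items():
        genes = []
        for identifier in identifiers:
            gene = entrez_map.get(identifier)
            # Keep only identifiers that map, without repeating a gene.
            if gene is not None and gene not in genes:
                genes.append(gene)
        if not genes:
            # No identifier maps to Entrez Gene, so the pathway is left out.
            continue
        lines.append("\t".join([pathway_id, "na", *genes]))
    return lines


def should_write_gmt(pathways, entrez_map):
    """A species gets a gmt file only if at least one pathway survives the filter."""
    return bool(build_gmt_lines(pathways, entrez_map))
```

Under these assumptions, a species whose identifiers never map to Entrez Gene yields an empty list, which corresponds to no gmt file being created, and a species with only some mappable pathways yields fewer gmt lines than it has gpml files.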

khanspers commented 6 years ago

Right, I checked and this is the reason for both of these issues. Closing.