wikipathways / wikipathways.org

The main web site for the WikiPathways project.
http://wikipathways.org
GNU General Public License v2.0

OWL files not available for some PWs #73

Open DeniseSl22 opened 6 years ago

DeniseSl22 commented 6 years ago

Hi all,

Someone I'm collaborating with wants to use the .owl files for the visualisation of his data (which unfortunately cannot be done in PV). However, the bulk download doesn't include the OWL file format.

I tried to manually download the pathways on the list that he sent me, but there are some PWs which I also cannot download manually, e.g. https://www.wikipathways.org/index.php/Pathway:WP4290 and https://www.wikipathways.org/index.php/Pathway:WP4297 give the following error when I click on "download" and then select Biopax level 3 .owl --> [image]

Several other PWs I could download like this (e.g. 'WP134_r94935', 'WP1913_r93880', 'WP1925_r93769'), which included some Reactome PWs; nice to see that this works!

I'm guessing that the code creating the owl file is here: https://github.com/wikipathways/wikipathways.org/blob/92e2bb99b3e564e25ba13f557d631e7e5459ca34/wpi/rdf/Biopax_level3.java

So could it be that there are some elements in the PW, making it unable to be stored in OWL format?

In both PWs that I cannot download like this, there are some "graphical elements" showing different cellular components. Could it be that these are causing the error when creating the owl file? I checked another PW that I know has these elements in it (it's from a student of mine, https://www.wikipathways.org/index.php/Pathway:WP4304 ), and again I cannot download the owl file... Hope someone can help me along with this :)
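One way to test the "graphical elements" hypothesis would be to count the element types in the GPML of a failing pathway and compare them with a pathway that downloads fine. The following is only a minimal sketch: it assumes the graphical elements are stored as top-level `Shape` elements in the GPML, and the sample document is a tiny hand-written stand-in, not the real content of WP4290.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_gpml_elements(gpml_text):
    """Count the direct children of the GPML root by tag name, ignoring the namespace."""
    root = ET.fromstring(gpml_text)
    counts = Counter()
    for child in root:
        # Namespaced tags look like "{namespace}Shape"; keep the part after "}".
        local_name = child.tag.rsplit("}", 1)[-1]
        counts[local_name] += 1
    return counts


# ASSUMPTION: hand-written stand-in for a GPML file, only for illustration.
SAMPLE_GPML = """<Pathway xmlns="http://pathvisio.org/GPML/2013a" Name="demo">
  <DataNode TextLabel="GeneA" GraphId="a1" Type="GeneProduct"/>
  <DataNode TextLabel="GeneB" GraphId="a2" Type="GeneProduct"/>
  <Shape GraphId="s1" TextLabel="Nucleus"/>
  <Label GraphId="l1" TextLabel="note"/>
</Pathway>"""

if __name__ == "__main__":
    print(dict(count_gpml_elements(SAMPLE_GPML)))
```

Running this on the GPML of a working and a failing pathway would show whether `Shape` (or some other element type) only appears in the failing ones.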

DeniseSl22 commented 6 years ago

I checked with Jonathan: he can get the .owl file by downloading the GPML file, opening it in PathVisio, and using the Biopax plugin to export the pathway as an OWL file (the content seems to match the PW)... but this is only a workaround; it would be nice if it worked on the website directly.

AlexanderPico commented 6 years ago

I don't think our OWL format is very good, frankly. So, I'd double check why the collaborator wants OWL. I'd recommend the RDF or any other format over OWL for our content.

DeniseSl22 commented 6 years ago

Yeah, there was a tool he could use to visualise the pathways which he could automate into his workflow. I said: use Cytoscape... but since he had never used it before (and learning it takes some time), he decided to test the visualisation with what he knows. But that doesn't take away the fact that the .owl converter is not working properly. For what other purposes can the .owl file be used?

Chris-Evelo commented 6 years ago

Did you also mention PathVisioRPC?

DeniseSl22 commented 6 years ago

Nope; didn't know that existed... I knew we could do some "simple" pathway analysis in R with a Bioconductor package, is that related? But I have to say that the data that has to be visualised is too complex to put in one PV node: the last dataset had images with 25,000 pixels (so an X and Y coordinate and an intensity for the colour). I don't know how to tell PV that the coordinates are needed to put the intensity at a certain place. Tina showed me how to do it in Cytoscape, but as I said before, that is not so easy to learn in an hour...

And this doesn't change the problem with wrong or missing .owl files...

Chris-Evelo commented 6 years ago

Agree, might not work. Still good to know it exists. It was actually created for automation of analysis and visualisation: http://dx.doi.org/10.1186/s12859-015-0708-8

fehrhart commented 5 years ago

Hi all, today I checked 18 pathways for something different (title overlap) and also ran into this problem. I saw that if the owl download does not work, the .txt, pwf, and pdf downloads do not work either. Maybe this is not an owl issue but a more general problem?

Of these 18 pathways, the download worked only for 4 pathways!

These are the pathways for which the download of these items does NOT work: WP363, WP3795, WP2788, WP1422, WP4050, WP134, WP61, WP268, WP3381, WP4249, WP500, WP4072, WP1795, WP197

And these are the ones for which all downloads work: WP47, WP531, WP383, WP428

Best regards, Freddie
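A check like the one above could be scripted so that it can be repeated after a fix. The following is a rough sketch, not a tested tool: the `DOWNLOAD_URL` pattern and the `type` values are assumptions that would need to be verified against the links behind the site's "download" menu, and the function names are made up for this example.

```python
from urllib.request import urlopen

# ASSUMPTION: this URL pattern and the "type" values are not confirmed in this
# thread; check them against the real download links before relying on them.
DOWNLOAD_URL = (
    "https://www.wikipathways.org/wpi/wpi.php"
    "?action=downloadFile&type={fmt}&pwTitle=Pathway:{wpid}"
)


def http_ok(url, timeout=30):
    """Return True if the URL answers with HTTP 200, False on any network/HTTP error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts are all OSError subclasses
        return False


def check_downloads(pathway_ids, formats, fetch_ok=http_ok):
    """Return {pathway_id: [formats that failed]} for each pathway with a failure."""
    failures = {}
    for wpid in pathway_ids:
        failed = [
            fmt for fmt in formats
            if not fetch_ok(DOWNLOAD_URL.format(fmt=fmt, wpid=wpid))
        ]
        if failed:
            failures[wpid] = failed
    return failures


# Example call (performs real HTTP requests, so it is left commented out):
# print(check_downloads(["WP47", "WP363"], ["owl", "txt", "pwf", "pdf"]))
```

The `fetch_ok` parameter makes it possible to swap in a fake checker, so the bookkeeping can be tried out without touching the website.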

DeniseSl22 commented 5 years ago

I think we also discussed this in SF; since the drawings in PV are very flexible, some things do not match up with the Biopax format, and therefore the OWL files cannot be created. I wasn't at that discussion, so perhaps someone who was could explain the current status of the Biopax conversion? @mkutmon @ariutta ?

Chris-Evelo commented 5 years ago

The only thing I remember discussing there is that using the RDF creation process might be a better starting point for valid BioPAX files than the current exporter. The RDF creator is a lot newer and also models towards BioPAX.

DeniseSl22 commented 5 years ago

Well, that sounds like a good idea. Then we can also unify to one type of identifier for Biopax (instead of the original IDs we have now). I would suggest unifying to ENSEMBL for geneproduct/protein DataNodes and Wikidata for metabolites (these are the databases we tested for our network approaches, and they give unique identifiers for nodes).
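To make the unification idea concrete, here is a minimal sketch of how a target identifier system could be chosen per DataNode type. The node dictionaries, the `mappings` lookup table, and the placeholder IDs are all hypothetical; in practice the lookup would come from a real mapping service such as BridgeDb rather than a hand-written dictionary.

```python
# Target identifier system per DataNode type, following the suggestion above.
TARGET_SYSTEM = {
    "GeneProduct": "Ensembl",
    "Protein": "Ensembl",
    "Metabolite": "Wikidata",
}


def unify_identifier(node, mappings):
    """Return a copy of `node` with its ID mapped to the target system, if possible.

    node:     dict with keys "type", "source" and "id" (hypothetical structure)
    mappings: {(source, id): {target_system: target_id}} (hypothetical lookup table)
    Nodes without a target system or without a known mapping are returned unchanged.
    """
    target = TARGET_SYSTEM.get(node["type"])
    if target is None or node["source"] == target:
        return node
    mapped_id = mappings.get((node["source"], node["id"]), {}).get(target)
    if mapped_id is None:
        return node  # keep the original ID rather than dropping the node
    return {**node, "source": target, "id": mapped_id}


if __name__ == "__main__":
    # Placeholder IDs only; these are not real database identifiers.
    mappings = {("Entrez Gene", "1234"): {"Ensembl": "ENSG_PLACEHOLDER"}}
    node = {"type": "GeneProduct", "source": "Entrez Gene", "id": "1234"}
    print(unify_identifier(node, mappings))
```

Keeping unmapped nodes unchanged is one possible design choice; an exporter could also log them so the gaps in the mapping become visible.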

Chris-Evelo commented 5 years ago

Agree. Note that ENSEMBL is already in the RDF (as are NCBI genes and UniProt); for metabolites we currently have HMDB and ChEBI, I think. You might want to extend the RDF with WikiData first.

DeniseSl22 commented 5 years ago

Wikidata is in the RDF (for metabolites) ;)

egonw commented 5 years ago

Only for the metabolites, not for the genes, proteins, etc. I have been playing with the idea where/when to do this, but I think the best moment is when we actually generate the RDF. That uses BridgeDb so these mappings from Wikidata should go into the Derby files for genes/proteins... but that is pending me actually being able to update that source code, which is currently not possible, see the bug reports here: https://github.com/bridgedb/create-bridgedb-genedb/issues (particularly https://github.com/bridgedb/create-bridgedb-genedb/issues/4 and https://github.com/bridgedb/create-bridgedb-genedb/issues/5).

Chris-Evelo commented 5 years ago

I don't think that that is needed for the current problem. We could indeed create BioPAX using ENSEMBL for gene products and WikiData for metabolites. Basically, that sounds like we should just give it a try and see what the BioPAX validator thinks.