monarch-initiative / monarchr

R package for easy access, manipulation, and analysis of Monarch KG data

Coordinating R package projects #7

Open bschilder opened 6 months ago

bschilder commented 6 months ago

Hi there,

I'm in the process of developing another R package, KGExplorer, which seeks to merge knowledge graphs from multiple biomedical resources, with a special focus on Monarch to start. I'm quite partial to the tidygraph representation and plan to use that as one of the primary object classes, in addition to the ontology_DAG class within simona for ontologies.

Some potential integrations:


I was also just having a discussion with @charlieccarey about their R package of the same name:

@charlieccarey's package has some nice functionalities like searching for specific types of interactions, e.g. bioentity_interactions_assoc_w_gene

I'm mainly posting here to see if there's a way we could coordinate our efforts. I haven't explored the package in your repo yet but would be happy to make contributions if there's room for it.

All the best, Brian


Brian M. Schilder
PhD Candidate
Neurogenomics Lab
Faculty of Medicine, Department of Brain Sciences
UK Dementia Research Institute, Imperial College London
CV | bschilder.github.io/CV/CV
LinkedIn | linkedin.com/in/brian-schilder
Website I | bschilder.github.io/BMSchilder
Twitter | twitter.com/BMSchilder
Lab | neurogenomics.co.uk

bschilder commented 6 months ago

@oneilsh would you be the correct contact point for this?

bschilder commented 6 months ago

Perhaps @cmungall as well

oneilsh commented 6 months ago

@bschilder thank you for the flag! Yes that would be me :) CCing @cmungall, @kevinschaper, and @sagehrke FYI

I think your proposal to coordinate sounds good, we've been planning something specifically for Monarch, but thinking more generally or at least doing so in a way that can support or integrate with other KGs would certainly not hurt. As the commits reveal I haven't worked on this repo in a while and there's not much functionality implemented yet.

I'm also a fan of tidygraph and had planned to use it heavily, building a small Monarch-specific DSL; or more KG-generic, which sounds like your interest. See this little user story example (but don't take it too seriously, especially the proposed data model). I hadn't seen simona before, I presume you mean this one, I don't have a lot of experience with ontologies and KGs so I'm not as familiar with the available packages, especially in R. Our lab and collaborators do quite a lot of good work in Python.

We've gone back and forth a bit on the best way to get the data to the package, between bulk file download, something like a bioconductor data package, or using the REST API (ala @charlieccarey's package, but there've been big graph and API updates recently as you've noticed). It's on our short-term plans however to release a neo4j endpoint for the KG, which I think will be the way to go.

So yeah, let's build slick KG packages for R!

bschilder commented 6 months ago

> @bschilder thank you for the flag! Yes that would be me :) CCing @cmungall, @kevinschaper, and @sagehrke FYI

> I think your proposal to coordinate sounds good, we've been planning something specifically for Monarch, but thinking more generally or at least doing so in a way that can support or integrate with other KGs would certainly not hurt. As the commits reveal I haven't worked on this repo in a while and there's not much functionality implemented yet.

Amazing, thanks for the positive response @oneilsh !

I think it might be nice to keep it modular, and have monarchr as its own package that a meta-package can call as a dependency. That prevents the code base from getting exceedingly complicated and lets you unit test your packages a bit more easily in some ways. I have a CI-assistant package for making this kind of multi-package maintenance a bit more manageable, called rworkflows.

> I'm also a fan of tidygraph and had planned to use it heavily, building a small Monarch-specific DSL; or more KG-generic, which sounds like your interest. See this little user story example (but don't take it too seriously, especially the proposed data model). I hadn't seen simona before, I presume you mean this one, I don't have a lot of experience with ontologies and KGs so I'm not as familiar with the available packages, especially in R. Our lab and collaborators do quite a lot of good work in Python.

Yes, that's the one! Sorry, pasted the wrong link. simona seems to be quite well maintained and has extensive functions for computing every variety of similarity matrix you can imagine. I'm in the process of transitioning my HPO-specific package (HPOExplorer) to using this format instead of ontologyIndex.

> We've gone back and forth a bit on the best way to get the data to the package, between bulk file download, something like a bioconductor data package, or using the REST API (ala @charlieccarey's package, but there've been big graph and API updates recently as you've noticed). It's on our short-term plans however to release a neo4j endpoint for the KG, which I think will be the way to go.

Atm, KGExplorer only pulls in the bulk data files, which isn't ideal for all use cases but surprisingly usable. Would definitely be worth doing some benchmarking to see which approach is fastest, but I do really like the idea of using neo4j as this seems perfectly suited for the Monarch db use case.

> So yeah, let's build slick KG packages for R!

Heck yeah! Perhaps the next step would be to have a meeting to flesh out the plans and see what the best way to tackle them might be.

I'd be happy to send out a meeting scheduler invite to figure out when that might work best (after the holidays of course).

oneilsh commented 6 months ago

@bschilder happy post-holidays :)

I would definitely be up for a brainstorming session, it would be nice to get your perspective. Can you send me an email to my shawn@tislab.org address to set up a time and zoom?

bschilder commented 6 months ago

> @bschilder happy post-holidays :)

Hope you had a great holiday break yourself :)

> I would definitely be up for a brainstorming session, it would be nice to get your perspective. Can you send me an email to my shawn@tislab.org address to set up a time and zoom?

Amazing, thanks! I'll send you an email now. If anyone else would like to join the call, do let me know and I can add your emails as well @cmungall @kevinschaper @sagehrke @charlieccarey

sagehrke commented 6 months ago

Thank you @bschilder and @oneilsh! I will sit this one out, but I am excited to hear what comes of this call. 👍🏼

monicacecilia commented 5 months ago

Dear @oneilsh, could you please share an update on where this coordination effort ended up? Are there next steps?