geniusrise / healthcare

A collection of Bolts and Spouts for healthcare, including knowledge graphs - SNOMED-CT, NER, clinical notes bot
https://docs.geniusrise.ai

Request for Setup Instructions #1

Open crkarthik11 opened 1 month ago

crkarthik11 commented 1 month ago

Hello

I came across the Geniusrise framework repo and found it very interesting. While trying to set up this demo project, I noticed that some of the model files are missing. Are the files below publicly available?

networkx_graph="./saved/snomed.graph" \
faiss_index="./saved/faiss.index.Bio_ClinicalBERT" \
concept_id_to_concept="./saved/concept_id_to_concept.pickle" \
description_id_to_concept="./saved/description_id_to_concept.pickle" \

Thanks

ixaxaar commented 1 month ago

Hey, those are very large files and cannot be uploaded to GitHub. You also need to sign a license with the NIH, so I cannot distribute them. (Healthcare data has various restrictions, but it is not that hard to obtain: sign up, apply, and approval takes a week at most.)

check this out for links: https://github.com/geniusrise/awesome-healthcare-datasets

I can help you generate them using the dataset once you get your hands on the raw data!

crkarthik11 commented 1 month ago

Hi @ixaxaar ,

Thanks for the response. I have access to UMLS, Snomed CT International Version, and MIMIC 3. Would I need any other datasets?

I tried following the linked docs, but I think they are still a work in progress. I would be happy to document the steps I took to set things up and possibly raise a pull request to complete the steps in the docs.

Thanks.

ixaxaar commented 1 month ago

Awesome!

look at tests/test_load.py

All you have to do is execute the unit test test_load_snomed_into_networkx and it will create everything. You can also choose what to create: there are 2 more tests there, e.g. test_load_snomed_into_networkx_no_index, or you can comment out the parts you do not want to generate.
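Once the test has run, a quick check along these lines could confirm which of the four files from the first comment were actually produced. This is only a minimal sketch: the helper name `missing_artifacts` is hypothetical and not part of the repo, and it assumes the files land under `./saved` with the names quoted earlier in this thread.

```python
from pathlib import Path

# File names copied from the first comment in this thread.
EXPECTED_ARTIFACTS = (
    "snomed.graph",
    "faiss.index.Bio_ClinicalBERT",
    "concept_id_to_concept.pickle",
    "description_id_to_concept.pickle",
)


def missing_artifacts(saved_dir="./saved"):
    """Return the expected artifact names that do not exist in saved_dir.

    Hypothetical helper for illustration; it is not part of the repo.
    """
    base = Path(saved_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_artifacts()
    print("all artifacts present" if not missing else f"missing: {missing}")
```

If the no-index variant of the test was used, the faiss index would be expected to show up as missing.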


Oh yep, this was a demo which I maintained for some time before deciding to pivot on building geniusrise as a platform instead.

I'm still working on it btw, and have recently been integrating:

  1. Disease ontology
  2. Gene Ontology
  3. LOINC
  4. MESH
  5. RXNORM
  6. UMLS

If you'd like, give me maybe a week and I'll have all of the above in and linked together into one large graph.

ixaxaar commented 1 month ago

The doc btw is here -> https://github.com/geniusrise/docs/blob/master/docs/guides/dev_cycle.md

This project is too big for me to maintain alone, contributions are always super welcome!

crkarthik11 commented 1 month ago

Sounds great, I will try to set up tests/test_load.py and get back with some updates.

crkarthik11 commented 2 weeks ago

Hi

I see there are some new updates to the repo, but I cannot find the test files in the latest update.

ixaxaar commented 2 weeks ago

Hey, I managed to load a bunch of graphs but I'm still joining them. Each graph has its own nuances, so it's taking time. I was using the base file to load each graph for testing, e.g. python ./geniusrise_healthcare/knowledge_graphs/base.py.

Also, the scripts to download the data are in scripts/.

crkarthik11 commented 6 days ago

Hey, I am able to load SNOMED CT concepts into the graph using the scripts, and I have raised a pull request with some minor fixes. I noticed that some of the APIs have been removed, so I will wait for new commits. If any help is required with testing, please let me know.
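For anyone following along, loading SNOMED CT relationships into a graph can be sketched roughly as below. This is not the repo's loader: it assumes the standard RF2 relationship column layout, keeps only active "Is a" rows, and uses a plain dict instead of networkx so it runs without extra dependencies. The function names and the sample rows (with placeholder ids such as `C3`) are made up for illustration.

```python
import csv
import io
from collections import defaultdict

IS_A = "116680003"  # SNOMED CT concept id of the "Is a" relationship type


def load_is_a_edges(rf2_file):
    """Build {child_id: set(parent_ids)} from an RF2 relationship file object."""
    parents = defaultdict(set)
    reader = csv.DictReader(rf2_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Keep only active rows that express an "Is a" (subtype) relationship.
        if row["active"] == "1" and row["typeId"] == IS_A:
            parents[row["sourceId"]].add(row["destinationId"])
    return dict(parents)


def ancestors(parents, concept_id):
    """Return every concept reachable by following "Is a" edges upward."""
    seen, stack = set(), [concept_id]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Tiny made-up sample in the RF2 column layout; all ids are placeholders.
SAMPLE = (
    "id\teffectiveTime\tactive\tmoduleId\tsourceId\tdestinationId\t"
    "relationshipGroup\ttypeId\tcharacteristicTypeId\tmodifierId\n"
    "1\t20230101\t1\tm\tC3\tC2\t0\t116680003\tc\tx\n"  # active Is-a: kept
    "2\t20230101\t1\tm\tC2\tC1\t0\t116680003\tc\tx\n"  # active Is-a: kept
    "3\t20230101\t0\tm\tC3\tC9\t0\t116680003\tc\tx\n"  # inactive: skipped
    "4\t20230101\t1\tm\tC3\tC7\t0\t363698007\tc\tx\n"  # other type: skipped
)

graph = load_is_a_edges(io.StringIO(SAMPLE))
print(sorted(ancestors(graph, "C3")))  # prints ['C1', 'C2']
```

With a real release, the same function could be pointed at an opened relationship snapshot file instead of the in-memory sample.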

ixaxaar commented 5 days ago

Hey thanks man! I've created this PR to track what I'm currently up to -> https://github.com/geniusrise/healthcare/pull/3. I'm adding APIs to these KGs in 3 ways:

  1. graph apis
  2. lucene-based reverse index searches of these graphs
  3. faiss based semantic searches of these graphs
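To make the three access styles concrete, here is a toy sketch of each over a handful of made-up concepts. It is not the code in the PR: a plain dict stands in for the graph, a small in-memory inverted index stands in for Lucene, and brute-force cosine similarity over bag-of-words counts stands in for faiss over model embeddings. All names and data below are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Made-up concepts and edges, purely for illustration.
CONCEPTS = {
    "C1": "heart disease",
    "C2": "ischemic heart disease",
    "C3": "myocardial infarction",
    "C4": "fracture of femur",
}
EDGES = {"C2": {"C1"}, "C3": {"C2"}}  # child -> parents


def tokens(text):
    """Lowercase a label and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


# 1. Graph API: direct parents of a concept.
def parents_of(concept_id):
    return sorted(EDGES.get(concept_id, ()))


# 2. Reverse (inverted) index: token -> ids of concepts containing it.
INDEX = defaultdict(set)
for cid, label in CONCEPTS.items():
    for tok in tokens(label):
        INDEX[tok].add(cid)


def keyword_search(query):
    """Return ids of concepts whose label contains every query token."""
    hits = [INDEX.get(tok, set()) for tok in tokens(query)]
    return sorted(set.intersection(*hits)) if hits else []


# 3. Vector search: rank concepts by cosine similarity to the query.
def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query, k=2):
    """Return the k concept ids most similar to the query (ties by id)."""
    q = Counter(tokens(query))
    scored = [(cosine(q, Counter(tokens(label))), cid)
              for cid, label in CONCEPTS.items()]
    return [cid for _, cid in sorted(scored, key=lambda s: (-s[0], s[1]))[:k]]
```

The keyword index only matches exact tokens, while the vector search returns a ranked list even for partial overlap; with real embeddings it could also match labels that share no words with the query, which is presumably why both are useful alongside plain graph traversal.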

after this I'm gonna proceed to integrating various biomedical sources, e.g. papers and the like (bioRxiv, medRxiv, NCBI APIs etc)

I've been also thinking of integrating gget - that way a lot of *-omics databases can also be integrated

finally it will be time to build agents using these medical sources and openai compatible llms hosted wherever

Let me know if you want to take something up and I can like give you some structured specs etc

crkarthik11 commented 4 days ago

Hi,

It looks like a solid set of features that you have planned, let me know if I can help with any small tasks or modules to begin with.