MetabolicAtlas / data-generation

Process the raw data-files for ingestion into the Neo4j database
MIT License

feat: multiple data overlay sources #20

Closed e0 closed 3 years ago

e0 commented 3 years ago

This is related to MetabolicAtlas/private-issues#113.

Most of the data overlay processing logic is encapsulated in a single file to try to keep the functionality of this repo more modular. For more info, please refer to the doc comment for the processDataOverlayFiles function.

For testing, please use this branch from data-files. To validate that the data is correctly processed, make sure that the faulty rows mentioned in https://github.com/MetabolicAtlas/data-files/pull/14 are not included in the output.

e0 commented 3 years ago

> Very nice!
>
> Just a question about placing the dataOverlay output folder under /data/, which is copied into the neo4j container to import the database. How about having the output at the root, i.e. /dataOverlay?

Good point, I think that is a reasonable suggestion. I would actually propose renaming this repo, as it is not only processing and outputting data for the neo4j database. Then we can rename the current data folder to something like neo4j or neo4j-import.

mihai-sysbio commented 3 years ago

> I would actually propose renaming this repo, as it is not only processing and outputting data for the neo4j database.

Yes! How about simply dropping the neo4j-, ending up with data-generation? I'm also tagging @nanjiangshu @inghylt.

mihai-sysbio commented 3 years ago

The proposed name data-generation has been approved by the team, so I've gone ahead and renamed the repo.

e0 commented 3 years ago

> The proposed name data-generation has been approved by the team, so I've gone ahead and renamed the repo.

Thanks! We should update references to it from other repos and docs as well. Are you working on that too or do you want some help there?

mihai-sysbio commented 3 years ago

> Thanks! We should update references to it from other repos and docs as well. Are you working on that too or do you want some help there?

It'd be great if someone else could follow up (maybe it can be part of the current PRs). I wanted to do the rename now so that it doesn't become blocked by permissions.

e0 commented 3 years ago

> > Thanks! We should update references to it from other repos and docs as well. Are you working on that too or do you want some help there?
>
> It'd be great if someone else could follow up (maybe it can be part of the current PRs). I wanted to do the rename now so that it doesn't become blocked by permissions.

Ok, I can update the current PRs.

inghylt commented 3 years ago

Even though this is already merged, I just want to add that the code is really clean! Good job 🥇