microbiomedata / nmdc_notebooks

Jupyter Notebooks demonstrating R and Python-based access to NMDC metadata and data
Creative Commons Zero v1.0 Universal

Refactor notebooks to be compliant with berkeley rollout #65

Closed: kheal closed this 1 month ago

kheal commented 3 months ago

Long-running branch to prepare the notebooks for conversion to the Berkeley schema. This addresses: https://github.com/microbiomedata/issues/issues/726

Leave "#TODO" notes in the notebooks wherever we will need to edit URLs once the Berkeley schema is in production.
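As a rough sketch of how those notes could be collected later, the snippet below scans the code cells of a notebook's JSON for a marker string. The helper name `find_todos`, the sample notebook, and the placeholder URL are all hypothetical and not taken from this repository.

```python
import json


def find_todos(notebook_json, marker="#TODO"):
    """Return (cell_index, line) pairs for code-cell lines containing the marker."""
    nb = json.loads(notebook_json)
    hits = []
    for idx, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # In .ipynb files, "source" may be a list of lines or a single string.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            if marker in line:
                hits.append((idx, line.strip()))
    return hits


# Made-up two-cell notebook; the URL is a placeholder, not a real endpoint.
sample_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Step 1\n"]},
        {"cell_type": "code", "source": [
            "#TODO update url when berkeley schema is production\n",
            "base_url = 'https://example.org/api'\n",
        ]},
    ]
})

todos = find_todos(sample_notebook)
```

Running this over each `.ipynb` file before the production switch would give a checklist of the URLs still to be edited.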

Workflows will check the validity of each notebook when a push is made to this branch.

R notebooks that have been updated and are passing checks in this branch:

Python notebooks that have been updated and are passing checks in this branch:

brynnz22 commented 2 months ago

@kheal After making my changes I noticed that we can't just query workflow_execution_set; we need to specify the type as nmdc:MetagenomeAnnotation, because otherwise the results include other record types. You'll see even in your diff that you are getting records that aren't metagenome annotation records. To account for that, I added a type variable to the get_id_results function and passed it in all the calls. It's not really necessary for the material processing steps or the data generation step (although it is nice to specify that the type is nmdc:NucleotideSequencing), but I would recommend doing this in your notebooks as well. My counts were back on a similar track to the original notebooks (they were a little off, but I assume this had something to do with re-IDing).
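A minimal sketch of the type restriction described above, run against in-memory records rather than the live API. The helper names `build_type_filter` and `matches`, the `was_informed_by` link field, and the sample records are assumptions for illustration; the actual `get_id_results` signature is not shown in this thread.

```python
import json


def build_type_filter(id_field, ids, type_name=None):
    """Build a MongoDB-style filter, optionally restricted to one record type."""
    query = {id_field: {"$in": list(ids)}}
    if type_name is not None:
        query["type"] = type_name
    return query


def matches(record, query):
    """Tiny local evaluator for the two filter forms used above."""
    for field, cond in query.items():
        value = record.get(field)
        if isinstance(cond, dict) and "$in" in cond:
            values = value if isinstance(value, list) else [value]
            if not any(v in cond["$in"] for v in values):
                return False
        elif value != cond:
            return False
    return True


# Made-up records standing in for a mixed workflow_execution_set collection.
records = [
    {"id": "wf-1", "type": "nmdc:MetagenomeAnnotation", "was_informed_by": "dg-1"},
    {"id": "wf-2", "type": "nmdc:ReadQcAnalysis", "was_informed_by": "dg-1"},
    {"id": "wf-3", "type": "nmdc:MetagenomeAnnotation", "was_informed_by": "dg-2"},
]

untyped = build_type_filter("was_informed_by", ["dg-1"])
typed = build_type_filter("was_informed_by", ["dg-1"], "nmdc:MetagenomeAnnotation")

untyped_ids = [r["id"] for r in records if matches(r, untyped)]
typed_ids = [r["id"] for r in records if matches(r, typed)]

# JSON string form, as it might be passed in a request's filter parameter.
filter_param = json.dumps(typed)
```

Without the type, the query returns every workflow record linked to the ID; with it, only the annotation record survives, which is the difference in counts noted above.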

Also, I noticed you still have two mentions of omics processing in your heading and description in step 6.

brynnz22 commented 2 months ago

Tried to add you as a reviewer, but it looks like I can't because it's your PR.

kheal commented 2 months ago

Good catches @brynnz22. I'll fix those and review your changes this week.

kheal commented 2 months ago

@brynnz22, I've updated the R taxonomy notebook.

The python taxonomy notebook looks good, but I think we lost step 8's narrative (code looks good, but there's no text describing it anymore).

We also have dill as an import and reference it in the first chunk of text, but don't use it anymore. I should have taken that out earlier (#47), but I missed it. It works fine for me because I have dill loaded in my venv, but it's not consistent with the requirements if someone else pulls it locally to use.
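One way to catch a leftover import like this is to parse a notebook's code with the standard `ast` module and report imported names that are never referenced. This is only a sketch under simplifying assumptions: `unused_imports` is a hypothetical helper, the sample code is made up, and cells containing IPython magics (lines starting with `%`) would need to be stripped before parsing.

```python
import ast


def unused_imports(source):
    """Return imported modules whose bound names never appear in the code."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the name `a`; `import a as x` binds `x`.
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = alias.name
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                bound = alias.asname or alias.name
                imported[bound] = f"{node.module}.{alias.name}"
    # Every bare name referenced anywhere in the code (e.g. `pd` in `pd.DataFrame`).
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(mod for name, mod in imported.items() if name not in used)


# Made-up code standing in for a notebook's concatenated code cells.
# The code is only parsed, never executed, so the packages need not be installed.
notebook_code = """
import dill
import requests
import pandas as pd

resp = requests.get("https://example.org")
df = pd.DataFrame()
"""

result = unused_imports(notebook_code)
```

Any module reported here that is also missing from the requirements file is a candidate for removal.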

samobermiller commented 1 month ago

Created branch berkeley_nom off of main to update NOM visualizations folder with Berkeley API calls. Didn't branch off of this berkeley_refactor branch because the branch was created after our notebooks. https://github.com/microbiomedata/nmdc_notebooks/pull/80

brynnz22 commented 1 month ago

@samobermiller this works too! In the future though, you can just merge main into the Berkeley branch to get main's updates into the Berkeley branch and then branch off the Berkeley. But this way is fine too - it just means we'll have to merge a couple of PRs in after Berkeley.

brynnz22 commented 1 month ago

@kheal I added step 8 narrative back in and removed the dill import and reference.

review-notebook-app[bot] commented 1 month ago

bmeluch commented on 2024-10-17T17:17:52Z

If we want terms formatted as code in here, the backticks are incomplete