malariagen / ag1000g-phase3-data-paper

Other
1 stars 2 forks source link

Add nb to calc, store PCA data #41

Closed leehart closed 3 years ago

review-notebook-app[bot] commented 3 years ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

leehart commented 3 years ago
  • I'd rather use the ag3.py file to load data, perhaps we'll shift to intake in the future, but better to stick with what we are using for consistency?

Wouldn't it be better to change the other notebooks to use intake instead? Then we'll have consistency and better future-proofing. Intake is seems more intuitive and cleaner to me. Having to develop and maintain this ad hoc class import that we expect will become deprecated doesn't seem right. Also, even if there are two different ways of getting the data, this will at least demonstrate another way, maybe a better way, so it's difficult to see that minor inconsistency causing major problems -- but I do appreciate the value of maintaining consistency as well, for the right reasons. Didn't we recently declare that these notebooks are primarily just for transparency, rather than some sort of how-to guide?

I think I probably did use ag3.py before switching to intake, which I took to be the prefered route, judging from this guide that was posted on Discourse: https://nbviewer.jupyter.org/gist/hardingnj/ffa0ee973fa59f3c11d0b6e3e7f64dfd

I don't like using the ag3.py file, whereas I like using intake. Having said that, I'm happy to make that change.

leehart commented 3 years ago

I'm tempted to remove all of the Bokeh plotting from this nb, and just leave the Matplotlib checks, which will simplify/focus the nb on just producing the data. Originally it was handy to have the Bokeh plots (interactive HTML) to check that sample metadata aligned with points on the PCA plot, but it's feeling like overkill / mission-drift now.

leehart commented 3 years ago

Could we go up to 10 PCs in each analysis Lee? Otherwise looks good to me. Thanks!

Sure. We had 11 originally, and you asked for the change to 4, so why the change of mind?

hardingnj commented 3 years ago

Could we go up to 10 PCs in each analysis Lee? Otherwise looks good to me. Thanks!

Sure. We had 11 originally, and you asked for the change to 4, so why the change of mind?

I think that was a miscommunication on my part- apologies. I just wanted 4 in the plots that we were working with. May as well have the 10/11 if generating the data.

leehart commented 3 years ago

Bumping for review/merge, since #37 is already using the output from this nb.