bokulich-lab / q2-fondue

Functions for reproducibly Obtaining and Normalizing Data re-Used from Elsewhere
BSD 3-Clause "New" or "Revised" License

Data fetching by BioProject ID #29

Closed by misialq 3 years ago

misialq commented 3 years ago

~Requires #26 to be merged first.~

Closes #13, #24, #25.

~Note to self: rebase after #26 is merged.~

[Image: fetch_by_project_overview]

LenaFloerl commented 3 years ago

Tested with some BioProject IDs and works perfectly fine!

(I had some issues when fetching the PR upstream, because some files were marked as deleted upon switching to the new branch: `git checkout fondue-pr-test2` reported `D q2_fondue/__init__.py`, `D q2_fondue/_version.py`, etc. Restoring these before installing solved that. Thanks again @misialq!)

lina-kim commented 3 years ago

If the CLI command get-all can be run with BioProject IDs as input, is there a way of reconfiguring get-sequences to accept them as well? It feels a little unbalanced to have the option for get-metadata and get-all, but not for get-sequences.

misialq commented 3 years ago

Hey @lina-kim, thanks for your review! Yes, I absolutely agree: it would be best to have them all work the same way. It wasn't very straightforward, though, because to make it work we would probably need to add some parts of the efetcher into get_sequences. So maybe this could be a follow-up to this PR. Let's discuss that on Thursday!