Closed abhi18av closed 1 year ago
This could overlap with https://github.com/nextflow-io/nextflow/pull/1611
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is a good candidate for a Nextflow plugin, along the same lines as nf-sqldb
I'd be happy to give this a shot 👍
Update:
The work is being done on my fork as of now https://github.com/abhi18av/nextflow/tree/abhinav/nf-sraql , with BigQuery as the default source.
Once it is presentable, I'll create and link the PR to this repo.
Cool! Willing to make a PR so changes will be more clear?
Absolutely, will make a PR ~by EOD today~ 👍
Initiated the draft PR with the scratch work, happy to receive any feedback.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
WIP - not stale.
Whoa, this went under my radar after the health crisis. @pditommaso, could you confirm whether this is still relevant? If so, I'd be happy to pick this back up and make a push.
Not a priority, but surely a nice-to-have. Shouldn't this work via a DB JDBC connection? What's missing?
I think it is already working for BigQuery, but I needed to accommodate paging issues for large result sets.
The most useful thing would be an example in the readme. Without that, nobody will even know it exists.
New feature
The recent collaboration between NCBI and the cloud providers allows one to query the entire archive based on the metadata in AWS Athena. Here are some relevant resources:
https://www.ncbi.nlm.nih.gov/sra/docs/sra-cloud/
https://registry.opendata.aws/ncbi-sra/
https://www.ncbi.nlm.nih.gov/sra/docs/sra-athena-examples/
https://www.youtube.com/playlist?list=PLH-TjWpFfWrt5MNqU7Jvsk73QefO3ADwD
NOTE: The same could be done for GCP as well; for now I have not created a separate issue for that.
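To make the idea of querying the archive by metadata more concrete, here is a minimal Python sketch that only builds the SQL text for such a query. The table name `sra.metadata` and the column names `acc`, `organism` and `assay_type` are assumptions based on the NCBI examples linked above and may differ in your own Athena setup; the helper names are hypothetical.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sra_metadata_query(organism: str, assay_type: str, limit: int = 10,
                             table: str = "sra.metadata") -> str:
    """Return a SELECT statement listing run accessions that match the filters.

    The table and column names are assumptions, not a confirmed schema.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        f"SELECT acc, organism, assay_type FROM {table} "
        f"WHERE organism = {sql_literal(organism)} "
        f"AND assay_type = {sql_literal(assay_type)} "
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_sra_metadata_query("Homo sapiens", "RNA-Seq", limit=5))
```

The resulting string could then be pasted into the Athena console or submitted through a client library; that part is left out because it needs cloud credentials.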
Suggested implementation
I'm sure there must be a more elegant implementation, but as an initial draft we could implement this in a couple of ways:
- `fromNCBI`, which allows one to pass a closure-based query for any particular database from NCBI.
- `fromSRA` method, which allows a closure to be passed to the `query` field.
Related: https://github.com/nextflow-io/nextflow/issues/1605
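The second option, a factory that accepts a closure for its query field, might look roughly like the Python sketch below. This is an illustration of the pattern only, not Nextflow code: the `Query` class, the Python `from_sra` function, the `sra.metadata` table name and the column names are all hypothetical, and a real implementation would be a Groovy closure inside a Nextflow channel factory or plugin.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Query:
    """Hypothetical query builder; not part of Nextflow."""
    table: str = "sra.metadata"  # assumed table name
    columns: List[str] = field(default_factory=lambda: ["acc"])
    conditions: List[str] = field(default_factory=list)

    def select(self, *columns: str) -> "Query":
        """Replace the selected columns and return self for chaining."""
        self.columns = list(columns)
        return self

    def where(self, condition: str) -> "Query":
        """Add a condition; all conditions are joined with AND."""
        self.conditions.append(condition)
        return self

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql


def from_sra(query: Callable[[Query], object]) -> str:
    """Apply the caller's callback to a fresh Query and return the SQL text."""
    q = Query()
    query(q)  # the callback plays the role of the closure
    return q.to_sql()


if __name__ == "__main__":
    print(from_sra(query=lambda q: q.select("acc", "organism")
                   .where("organism = 'Homo sapiens'")))
```

The point of the design is that the caller describes the query declaratively while the factory stays responsible for where and how it is executed.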