qiita-spots / qiita

Qiita - A multi-omics databasing effort
http://qiita.microbio.me
BSD 3-Clause "New" or "Revised" License

Show links to all the sequence files of artifacts used in an analysis #534

Closed ElDeveloper closed 4 years ago

ElDeveloper commented 9 years ago

Ideally being able to grab the sequences from a meta-analysis.

ElDeveloper commented 9 years ago

@rob-knight mentioned that this is highly needed, as it will reduce the number of requests from collaborators. However, this will not be part of the release because, as @antgonza pointed out, we are not storing the sequences for any of the EMP studies; we are only loading the OTU tables. This should be available soon, but not for the initial release. Moving to the alpha milestone.

rob-knight commented 9 years ago

Correct. Priority is:

  1. Give access to the EMP tables and allow collaborators to get serious about the metadata cleanup effort. This will also give the developers a much-needed break.
  2. Public release, includes ability to upload/process seqs, 454 data, etc.
  3. Other markets, multi-omics, visualization improvements etc.

Thanks everyone for all the hard work and glad we are getting close...

Rob

wasade commented 9 years ago

We should be in a good position with the hdf5 demux for this. I think we'll want to send out gzip'd fastq though, but the translation mechanisms are already in place. We might also be able to do this without using the filesystem as a go-between by just streaming, which would be really awesome.
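
A minimal sketch of the streaming idea, not Qiita's actual code: the reads are plain `(id, sequence, quality)` tuples standing in for whatever the hdf5 demux file yields, and `fastq_records` and `gzip_stream` are hypothetical names. It compresses FASTQ text incrementally in memory, so no temporary file is needed.

```python
import gzip
import zlib


def fastq_records(reads):
    """Yield one FASTQ text record per (id, sequence, quality) tuple."""
    for read_id, seq, qual in reads:
        yield f"@{read_id}\n{seq}\n+\n{qual}\n"


def gzip_stream(text_chunks):
    """Yield gzip-compressed bytes for the text chunks without touching disk."""
    # wbits = MAX_WBITS | 16 asks zlib for a gzip header and trailer.
    compressor = zlib.compressobj(wbits=zlib.MAX_WBITS | 16)
    for chunk in text_chunks:
        data = compressor.compress(chunk.encode("ascii"))
        if data:  # the compressor may buffer small inputs and return b""
            yield data
    # flush() emits whatever is still buffered plus the gzip trailer.
    yield compressor.flush()


# Hypothetical reads; in practice these would come from the demux file.
demo_reads = [("s1_0", "ACGT", "IIII"), ("s1_1", "GGCA", "IIHH")]
payload = b"".join(gzip_stream(fastq_records(demo_reads)))
print(gzip.decompress(payload).decode("ascii"))
```

Because `gzip_stream` is a generator, a web handler could write each chunk to the response as it is produced instead of joining them first.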

antgonza commented 9 years ago

This is already solved.

adamrp commented 9 years ago

I think this issue had more to do with the ability to download a single sequence file that represents the entire meta-analysis (which I don't think we have currently implemented, unless I missed something).

antgonza commented 9 years ago

Got it!

antgonza commented 5 years ago

This issue has changed focus during its life; the latest request is to download all the sequences from a meta-analysis. IMO this has been superseded as a requirement for the system: for a meta-analysis of deblurred sequences, the Qiita BIOM artifact stores the deblur sequences as its ids. In other words, the BIOM table has the sequences as ids, which makes all the sequences from the study available. Additionally, an analysis now builds a tree on the fly when possible. Anyway, I think this can be closed.
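
To illustrate the point about sequences being the ids, here is a rough sketch that turns a list of ids into FASTA records. It assumes the ids have already been pulled out of the BIOM table (for example with the biom-format package's `Table.ids(axis="observation")`); `ids_to_fasta` is a hypothetical helper, and labelling each record with the MD5 of its sequence is just one possible convention.

```python
import hashlib


def ids_to_fasta(sequence_ids):
    """Yield one FASTA record per id, where each id is itself a DNA sequence."""
    for seq in sequence_ids:
        seq = seq.upper()
        # Reject ids that are not sequences (e.g. OTU ids from a non-deblur table).
        if not seq or set(seq) - set("ACGTN"):
            raise ValueError(f"id is not a DNA sequence: {seq!r}")
        # Assumption: use the MD5 of the sequence as a short, stable label.
        label = hashlib.md5(seq.encode("ascii")).hexdigest()
        yield f">{label}\n{seq}\n"


# Hypothetical ids standing in for the observation ids of a deblur table.
for record in ids_to_fasta(["acgtacgt", "TTGGCCAA"]):
    print(record, end="")
```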

@ElDeveloper, what do you think?

ElDeveloper commented 5 years ago

Although I don't have this use case, I believe users could still want to use the sequences for an alternative sequence clustering method, which is not something you would want to do from deblurred sequences.

antgonza commented 5 years ago

OK, I would not recommend that users do that, but it sounds like a fair use case. What about showing in the analysis a link to the raw data that generated each artifact being used? This will allow users to select which raw data they want to download ...

ElDeveloper commented 5 years ago

Yes, I think that would likely be helpful. The only thing missing from something like that would be the preparation files, so that the sequences could be processed as needed with any barcode/primer/linker information.

antgonza commented 5 years ago

Got it, thanks! Should we provide prep information or full qiime1 mapping files?

ElDeveloper commented 5 years ago

Come to think of it, that information would already be included as part of the mapping file used for analysis. Therefore linking to all the sequence files of studies (where sequences can be shared) would probably be sufficient.
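
As a rough illustration of why the mapping file is enough, the sketch below pulls the barcode and linker/primer for each sample out of a QIIME 1 style mapping file (tab-separated, header starting with `#SampleID`, with `BarcodeSequence` and `LinkerPrimerSequence` columns). The function name and the sample rows are made up for the example.

```python
import csv


def barcode_primer_by_sample(mapping_text):
    """Map each SampleID to its (BarcodeSequence, LinkerPrimerSequence) pair."""
    rows = [line for line in mapping_text.splitlines() if line.strip()]
    if not rows or not rows[0].startswith("#SampleID"):
        raise ValueError("expected a header line starting with #SampleID")
    header = rows[0][1:].split("\t")  # drop the leading '#'
    # Any other line starting with '#' is a comment in QIIME 1 mapping files.
    data = [row for row in rows[1:] if not row.startswith("#")]
    reader = csv.DictReader(data, fieldnames=header, delimiter="\t")
    return {
        row["SampleID"]: (row["BarcodeSequence"], row["LinkerPrimerSequence"])
        for row in reader
    }


# Made-up mapping file contents, for illustration only.
example = (
    "#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tDescription\n"
    "#A comment line\n"
    "S1\tAGCACGAGCCTA\tGTGCCAGCMGCCGCGGTAA\tfirst sample\n"
    "S2\tAACTCGTCGATG\tGTGCCAGCMGCCGCGGTAA\tsecond sample\n"
)
print(barcode_primer_by_sample(example))
```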

antgonza commented 4 years ago

The analysis/meta-analysis page contains a summary of all the studies used to create it. Each row contains a link to the study page, which then provides a single link to download all the raw data (if the owner has allowed raw data downloads) and/or the BIOMs, along with the sample and prep information files. Thus, a user can download each study's raw data for any given meta-analysis with 2 clicks.

Example here, taken from https://qiita.ucsd.edu/analysis/description/15093/ (image: studies_public_analysis)

Thus, closing.