leekgroup / recount

R package for the recount2 project. Documentation website: http://leekgroup.github.io/recount/
https://jhubiostatistics.shinyapps.io/recount/

Search by GSE #11

Closed assaron closed 3 years ago

assaron commented 7 years ago

Hi, is the abstract search the canonical way to load an experiment by GSE ID? If so, there is a better way: you could download the SRA metadata file https://ftp.ncbi.nlm.nih.gov/sra/reports/Metadata/SRA_Accessions.tab and look up the GSE-to-SRP mapping there. I think it would be a super useful feature.
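
A minimal sketch of that lookup, in Python for illustration. It assumes the table is tab-delimited with `Accession`, `Type` and `Alias` columns, and that rows of type `STUDY` submitted through GEO carry the GSE ID in `Alias`; check both assumptions against the real file before relying on them. The function name `gse_to_srp` and the toy rows are hypothetical, and the accessions in them are made up.

```python
import csv
import io


def gse_to_srp(lines):
    """Build a {GSE ID: SRP ID} dict from tab-delimited accession rows."""
    # QUOTE_NONE: treat quote characters in free-text fields as plain text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    mapping = {}
    for row in reader:
        # Assumption: only STUDY rows link a GEO series to an SRA study.
        if row.get("Type") != "STUDY":
            continue
        alias = row.get("Alias") or ""
        accession = row.get("Accession") or ""
        # Assumption: GEO-submitted studies use the GSE ID as their alias.
        if alias.startswith("GSE") and accession.startswith("SRP"):
            mapping[alias] = accession
    return mapping


# Toy rows with made-up accessions, standing in for the real (large) file.
TOY_TABLE = (
    "Accession\tType\tAlias\n"
    "SRP000001\tSTUDY\tGSE00001\n"
    "SRR000001\tRUN\tGSM00001_r1\n"
    "SRP000002\tSTUDY\tsome_other_alias\n"
)

mapping = gse_to_srp(io.StringIO(TOY_TABLE))
print(mapping.get("GSE00001"))
```

For the real file, an open file handle can be passed in place of the `io.StringIO` object, so the table is streamed row by row rather than loaded into memory.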

lcolladotor commented 7 years ago

Interesting, though searching via GEO should also point you to the SRP id. For example, searching GEO for GSE69351 leads to https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE69351 which includes the SRP id in the bottom part of the page.

The abstract search we provide is very rudimentary and we are not really trying to re-create all the possible ways you can find SRP ids.

assaron commented 7 years ago

I just thought that, since you provide a table with abstracts in your package, it would also be nice to include references from other databases directly, as an additional column in the abstracts table or in any other standard way, whichever is easiest.

lcolladotor commented 3 years ago

I'm closing this issue since we have now released recount3 and we don't really plan on updating recount2. The search functionality though is still limited. This suggestion by @assaron is still interesting for recount3.

@ChristopherWilks, do we have the GSE IDs on recount3 for the samples downloaded from SRA? I think that the answer might be yes, though not in the metadata files we load. For example, I see some info under sra.study_title which would be enough for searching by text, though it's not its own separate column.
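
As a rough illustration of what searching by text could look like, here is a small Python sketch. It assumes the metadata is available as a list of dicts keyed by column name. The `project` key, the helper names and the toy records are hypothetical and not part of recount3, and whether GSE IDs actually appear in `sra.study_title` for a given study is not guaranteed.

```python
import re

# GEO series accessions look like "GSE" followed by digits.
GSE_PATTERN = re.compile(r"\bGSE\d+\b")


def search_studies(records, query, field="sra.study_title"):
    """Return records whose `field` contains `query`, ignoring case."""
    needle = query.lower()
    return [r for r in records if needle in (r.get(field) or "").lower()]


def find_gse_ids(records, field="sra.study_title"):
    """Collect any GSE-like IDs that happen to appear in a free-text field."""
    hits = {}
    for r in records:
        found = sorted(set(GSE_PATTERN.findall(r.get(field) or "")))
        if found:
            hits[r["project"]] = found  # "project" is a hypothetical key
    return hits


# Toy records with made-up IDs and titles.
records = [
    {"project": "SRP000001", "sra.study_title": "Brain RNA-seq (GSE00001)"},
    {"project": "SRP000002", "sra.study_title": "Liver time course"},
]

matches = search_studies(records, "rna-seq")
gse_hits = find_gse_ids(records)
print([r["project"] for r in matches], gse_hits)
```

A dedicated GSE column would make the regular expression unnecessary; the pattern match is only a fallback for IDs that ended up in free text.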

At the sample level, I see sra.sample_name (ran the first example at http://research.libd.org/recount3/reference/create_rse.html).

ChristopherWilks commented 3 years ago

We discussed having GEO/PubMed linked in at one point earlier this year (IIRC), and I did work on linking from the original SRA human v3 set, but didn't release anything. So any GEO-related accessions are whatever was in the SRA metadata, parsed out of the original submitted XMLs as a by-product, not intentionally.

It's something that could be revisited; I can give what I've done so far to someone else. It might be a reasonable first project for a new student to work on.

lcolladotor commented 3 years ago

I see, thanks for the clarification Chris!