Open Maarten-vd-Sande opened 3 years ago
This is a good idea! Paging @bscrow in case he is willing to help; otherwise I will add it to my list of ToDos.
Based on the current implementation, we can exclude private samples from the search by adding "AND Public[Access]" to the search text. For example, "bacterium AND nanopore" -> "bacterium AND nanopore AND Public[Access]". dbgap samples can be excluded by appending "NOT cluster dbgap[Properties]".
If this is a frequently used feature, maybe we can implement a separate flag for this functionality.
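Until such a flag exists, the two filters can be appended by hand. Here is a minimal sketch of that idea. The helper name `restrict_query` and its parameters are hypothetical and not part of pysradb; the filter strings are copied verbatim from the comment above.

```python
def restrict_query(query, public_only=True, exclude_dbgap=False):
    """Append the access filters quoted in this thread to an SRA search string.

    Hypothetical helper for illustration only; pysradb does not provide it.
    """
    parts = [query.strip()]
    if public_only:
        # Keep only entries with publicly available data.
        parts.append("AND Public[Access]")
    if exclude_dbgap:
        # Drop dbgap samples (filter text as quoted in this thread).
        parts.append("NOT cluster dbgap[Properties]")
    return " ".join(parts)


print(restrict_query("bacterium AND nanopore"))
# bacterium AND nanopore AND Public[Access]
```

The returned string can then be passed as the search text, in the same way as the hand-written query in the example above.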
My guess would be that this is something many people would be interested in. Does this then already work with the Python API? Something like:
from pysradb.search import SraSearch
instance = SraSearch(query="h3k27ac AND Public[Access]", selection="chip", publication_date="01-01-2015:01-01-2021", platform="illumina", organism="Homo sapiens", verbosity=2, return_max=1_000_000)
instance.search()
When I use this query I get fewer experiments, which sort of suggests it works?
P.S. So far the search functionality works really nicely, thanks for the feature @bscrow
Yep, that will work: it keeps only entries with publicly available data. From what I can see on SRA, this is slightly different from excluding dbgap samples, as some dbgap samples are classified as public.
I'm really glad that you like the search functionality! I'll try to see if I can implement the option to exclude dbgap samples from a search.
Is your feature request related to a problem? Please describe. Not sure if it is already implemented, but would it be possible to exclude private/dbgap samples when using the search functionality?
If it is not yet supported, that would be very useful functionality for me :)