ababaian / serratus

Ultra-deep search for novel viruses
http://serratus.io
GNU General Public License v3.0

Adding new column to the Serratus search? #275

Closed: mmsonder closed this issue 8 months ago

mmsonder commented 8 months ago

Hi, the Serratus search in pgAdmin normally has the columns listed below. Is it possible to add a new column to the Serratus search in pgAdmin? I need an "SRA Title" column so that I can search for specific terms and filter the results by SRA title. Do you know whether this is possible and, if so, how to do it?

Available columns in Serratus:
run_id: SRA run accession for the analysis represented in the summary file
family_name: Name of the family of the pan-genome that is being analyzed
sequence_accession: Name of the accession
coverage_bins: Coverage cartoon generated, giving a picture of the quality of alignment throughout the specific sequence
score: Score given for the quality of the alignment
percent_identity: Percent identity of the sequences aligned (with respect to the reference genome)
depth: Depth of the sequence
n_reads: Number of aligned reads
n_global_reads: Number of global aligned reads (excludes soft-clipped reads)
length: Length of accession
virus_name: Study name linking to the accession

ababaian commented 8 months ago

In general, no: it's not easy to change the schema of an SQL table. Each of our tables corresponds to a very specific input file format that the whole database depends on. If you want to search by "SRA Title", I'd recommend using the SRA web interface to generate a list of SRA run identifiers (SRRxxxxx, ERRxxxxx, DRRxxxxx, ...) and intersecting that list with the run_id column in SQL.
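A minimal sketch of that intersection, in case it helps. It uses Python's built-in sqlite3 with an in-memory stand-in table so it runs anywhere; the table name family_summary, the helper filter_by_run_ids, and the accessions and rows are all made up for illustration and are not the real Serratus schema or data. Against the actual PostgreSQL database the same temp-table join should carry over, but the driver's placeholder style would differ.

```python
import sqlite3


def filter_by_run_ids(conn, run_ids):
    """Return summary rows whose run_id appears in run_ids.

    Assumes a table named family_summary (hypothetical) with at least
    the columns run_id, family_name and score.
    """
    # Load the accession list into a temporary table, then join on run_id.
    conn.execute("DROP TABLE IF EXISTS wanted_runs")
    conn.execute("CREATE TEMP TABLE wanted_runs (run_id TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO wanted_runs (run_id) VALUES (?)",
        [(run_id,) for run_id in sorted(set(run_ids))],  # de-duplicate first
    )
    return conn.execute(
        "SELECT s.run_id, s.family_name, s.score "
        "FROM family_summary AS s "
        "JOIN wanted_runs AS w ON w.run_id = s.run_id "
        "ORDER BY s.run_id"
    ).fetchall()


# Stand-in data (invented) so the sketch is runnable without the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE family_summary (run_id TEXT, family_name TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO family_summary VALUES (?, ?, ?)",
    [
        ("SRR0000001", "Coronaviridae", 40),
        ("SRR0000002", "Coronaviridae", 75),
        ("ERR0000001", "Coronaviridae", 90),
    ],
)

# Accessions exported from an SRA title search (invented for the example).
title_hits = ["SRR0000002", "ERR0000001", "SRR9999999", "SRR0000002"]
rows = filter_by_run_ids(conn, title_hits)
```

Accessions from the SRA list that have no Serratus hit (SRR9999999 here) simply drop out of the join.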

If you want to retrieve the SRA title for each run in a list of results, you may have to write an efetch script, as described here: https://www.ncbi.nlm.nih.gov/books/NBK242621/#_SRA_Download_Guid_BK_Downloading_metadat_
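A rough sketch of what such a script could look like in Python, using only the standard library and the NCBI E-utilities esearch and efetch endpoints. The helper names are hypothetical, and the XML layout it assumes (an EXPERIMENT_PACKAGE containing EXPERIMENT/TITLE and RUN elements with an accession attribute) is my reading of the SRA XML, so check it against a real response before relying on it.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def titles_from_sra_xml(xml_text):
    """Map each run accession in an SRA XML document to its experiment title.

    Assumes the layout EXPERIMENT_PACKAGE > EXPERIMENT > TITLE, with RUN
    elements (carrying an 'accession' attribute) inside the same package.
    """
    titles = {}
    root = ET.fromstring(xml_text)
    for package in root.iter("EXPERIMENT_PACKAGE"):
        title = package.findtext("EXPERIMENT/TITLE", default="")
        for run in package.iter("RUN"):
            accession = run.get("accession")
            if accession:
                titles[accession] = title
    return titles


def fetch_sra_titles(run_id):
    """Look up one run accession via esearch, then efetch its SRA record."""
    # Step 1: esearch turns the accession into internal SRA UIDs.
    query = urllib.parse.urlencode({"db": "sra", "term": run_id})
    with urllib.request.urlopen(f"{EUTILS}/esearch.fcgi?{query}") as resp:
        uids = [elem.text for elem in ET.fromstring(resp.read()).iter("Id")]
    if not uids:
        return {}
    # Step 2: efetch returns the full XML record for those UIDs.
    query = urllib.parse.urlencode({"db": "sra", "id": ",".join(uids)})
    with urllib.request.urlopen(f"{EUTILS}/efetch.fcgi?{query}") as resp:
        return titles_from_sra_xml(resp.read())
```

Calling fetch_sra_titles with a run accession returns a dictionary keyed by run accession; because an experiment can hold several runs, it may include sibling runs as well. For long lists, pause between calls, since NCBI rate-limits requests made without an API key.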

Hope that helps.