wurmlab / sequenceserver

Intuitive graphical web interface for running BLAST bioinformatics tool (i.e. have your own custom NCBI BLAST site!)
GNU Affero General Public License v3.0

Exporting large numbers of hit sequences creates too long a command line #751

Closed yannickwurm closed 2 months ago

yannickwurm commented 2 months ago

Using -entry when you have tens of thousands of hits can crash the export process, because the comma-delimited list of identifiers makes the command line too long.

We should refactor https://github.com/wurmlab/sequenceserver/blob/1463d0570b93d6754af6c18c1e926fe3cbd632f4/lib/sequenceserver/sequence.rb#L198 by writing identifiers to a file and using -entry_batch instead.
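
A minimal sketch of what that refactor could look like, not SequenceServer's actual code. The helper name `with_entry_batch_command`, the temp-file prefix, and the database path are hypothetical; the only tool options used are `-db` and `-entry_batch`. It assumes identifiers contain no newlines and that multiple databases can be passed to `-db` as one space-separated argument.

```ruby
require 'tempfile'

# Hypothetical helper: writes one identifier per line to a temporary file
# and yields an argument vector that uses -entry_batch instead of -entry.
# The temporary file is deleted when the block returns.
def with_entry_batch_command(identifiers, database_names)
  Tempfile.create(['sequence_ids', '.txt']) do |file|
    identifiers.each { |id| file.puts(id) }
    file.flush # make sure the external process sees everything written

    command = ['blastdbcmd',
               '-db', database_names.join(' '),
               '-entry_batch', file.path]
    yield command, file.path
  end
end

# Usage: the command has the same length however many ids are exported.
ids = (1..50_000).map { |i| "seq#{i}" }
with_entry_batch_command(ids, ['/path/to/db']) do |command, _path|
  # A real caller would run it here, e.g. Open3.capture2(*command).
  puts command.join(' ')
end
```

Passing the command as an array rather than a single string also avoids shell quoting of identifiers such as 'gnl|dbname|tag', since nothing is interpolated into a shell command line.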

*** Retrieval options
 -entry <String>
   Comma-delimited search string(s) of sequence identifiers:
    e.g.: 555, AC147927, 'gnl|dbname|tag', or 'all' to select all
    sequences in the database
    * Incompatible with:  entry_batch, ipg, ipg_batch, info, metadata,
   tax_info, taxids, taxidlist, no_taxid_expansion, list, recursive,
   remove_redundant_dbs, list_outfmt, show_blastdb_search_path
 -entry_batch <File_In>
   Input file for batch processing (Format: one entry per line, seq id 
   followed by optional space-delimited specifier(s)
    * Incompatible with:  entry, range, strand, mask_sequence_with, ipg,
   ipg_batch, info, metadata, tax_info, taxids, taxidlist, no_taxid_expansion,
   list, recursive, remove_redundant_dbs, list_outfmt,