wurmlab / sequenceserver

Intuitive graphical web interface for running the BLAST bioinformatics tool (i.e. have your own custom NCBI BLAST site!)
https://sequenceserver.com
GNU Affero General Public License v3.0
268 stars, 111 forks

exporting large numbers of hit sequences creates too long a command-line #751

Closed: yannickwurm closed this 2 months ago

yannickwurm commented 2 months ago

Using -entry when you have tens of thousands of hits can crash the export process: every hit identifier is passed on the command line, and the resulting command becomes too long.

We should refactor https://github.com/wurmlab/sequenceserver/blob/1463d0570b93d6754af6c18c1e926fe3cbd632f4/lib/sequenceserver/sequence.rb#L198 by writing identifiers to a file and using -entry_batch instead.
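
A minimal sketch of what that refactor could look like, assuming the identifiers are available as a plain array of strings. The helper names `with_entry_batch_file` and `blastdbcmd_batch_command` are hypothetical and are not taken from sequence.rb; only the `-db` and `-entry_batch` options of blastdbcmd are relied on.

```ruby
require 'tempfile'
require 'shellwords'

# Hypothetical helper (not SequenceServer's actual code): writes one
# identifier per line to a temporary file and yields the file's path.
# The file is removed automatically when the block returns.
def with_entry_batch_file(accessions)
  Tempfile.create(['entry_batch', '.txt']) do |file|
    accessions.each { |accession| file.puts(accession) }
    file.flush # make sure everything is on disk before blastdbcmd reads it
    yield file.path
  end
end

# Hypothetical helper: builds a shell-escaped blastdbcmd invocation.
# The command length depends on the database names and the file path,
# not on the number of hits.
def blastdbcmd_batch_command(database_names, batch_path)
  ['blastdbcmd', '-db', database_names.join(' '),
   '-entry_batch', batch_path].shelljoin
end
```

The command has to be run inside the block, because the temporary file is deleted as soon as the block exits:

```ruby
ids = (1..50_000).map { |i| "seq#{i}" }
with_entry_batch_file(ids) do |path|
  command = blastdbcmd_batch_command(['/path/to/db'], path)
  # run `command` here, e.g. with Open3.capture3
end
```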

*** Retrieval options
 -entry <String>
   Comma-delimited search string(s) of sequence identifiers:
    e.g.: 555, AC147927, 'gnl|dbname|tag', or 'all' to select all
    sequences in the database
    * Incompatible with:  entry_batch, ipg, ipg_batch, info, metadata,
   tax_info, taxids, taxidlist, no_taxid_expansion, list, recursive,
   remove_redundant_dbs, list_outfmt, show_blastdb_search_path
 -entry_batch <File_In>
   Input file for batch processing (Format: one entry per line, seq id 
   followed by optional space-delimited specifier(s)
   [range|strand|mask_algo_id]
    * Incompatible with:  entry, range, strand, mask_sequence_with, ipg,
   ipg_batch, info, metadata, tax_info, taxids, taxidlist, no_taxid_expansion,
   list, recursive, remove_redundant_dbs, list_outfmt,
   show_blastdb_search_path
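
To illustrate the -entry_batch file format described above (one entry per line, sequence id followed by optional space-delimited specifiers), here is a small sketch that formats such lines. `BatchEntry` is a hypothetical name introduced for illustration. The help text only names the specifiers; the `start-stop` range notation and the `plus`/`minus` strand values are assumptions carried over from blastdbcmd's -range and -strand options.

```ruby
# Hypothetical value object for one line of an -entry_batch file.
# range is an optional Ruby Range of 1-based positions; strand is an
# optional string (assumed to be 'plus' or 'minus').
BatchEntry = Struct.new(:id, :range, :strand) do
  def to_line
    range_spec = range && "#{range.begin}-#{range.end}"
    # Drop missing specifiers, then join what is left with single spaces.
    [id, range_spec, strand].compact.join(' ')
  end
end

entries = [
  BatchEntry.new('AC147927'),
  BatchEntry.new('AC147927', 10..20, 'minus')
]
batch_file_body = entries.map(&:to_line).join("\n")
```

For plain hit export only the identifier is needed, so each line is just the id.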