ammaraziz / flukit

GNU Lesser General Public License v3.0

subcommand find/rename #6

Open ammaraziz opened 1 year ago

ammaraziz commented 1 year ago

Subcommand for finding fasta files specific to a batch.

Example usage:

flukit find \
    --input-dir {Path} \
    --input-meta {tsv or csv} \
    --batch-num {num} \
    --output-dir {Path} \
    --split-by gene
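
For discussion, the interface above could be declared roughly as follows. This is only a sketch using the standard-library `argparse` as a stand-in; it is not necessarily the CLI framework flukit uses. The function name `build_parser`, the integer type for `--batch-num`, and limiting `--split-by` to `gene` (the only value named in this issue) are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the proposed `flukit find` options."""
    parser = argparse.ArgumentParser(prog="flukit")
    subcommands = parser.add_subparsers(dest="command", required=True)

    find = subcommands.add_parser(
        "find", help="collect the fasta files belonging to one batch"
    )
    find.add_argument("--input-dir", type=Path, required=True,
                      help="directory to search for fasta files")
    find.add_argument("--input-meta", type=Path, required=True,
                      help="tsv or csv meta file")
    # Assumption: batch numbers are plain integers.
    find.add_argument("--batch-num", type=int, required=True)
    find.add_argument("--output-dir", type=Path, required=True)
    # Assumption: "gene" is the only split mode, since it is the only one named here.
    find.add_argument("--split-by", choices=["gene"], default=None)
    return parser
```

Leaving `--split-by` optional matches step 5 of the operation list, where splitting only happens if the option is given.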

Operation:

  1. Get a list of all fasta files from the input directory
  2. Parse meta file to extract Seq No
  3. Match lists, subset, get complete path
  4. Concat files together and write out
  5. If split-by is specified, split as appropriate
  6. Create a folder in output-dir named after the batch number
  7. Write out fasta files
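
The steps above could be sketched like this, using only the standard library. Several details are assumptions made for illustration and are not settled in this issue: the meta file has columns named `Seq No` and `Batch`, fasta files are named `<SeqNo>_<gene>.fasta`, and the output files are named `all.fasta` or `<gene>.fasta`. The helper names (`read_seq_numbers`, `gene_of`, `find_batch`) are hypothetical.

```python
import csv
from pathlib import Path

FASTA_SUFFIXES = {".fasta", ".fa", ".fas"}  # assumed extensions


def read_seq_numbers(meta_path, batch_num, seq_col="Seq No", batch_col="Batch"):
    """Step 2: return the set of Seq No values listed for one batch."""
    meta_path = Path(meta_path)
    delimiter = "\t" if meta_path.suffix.lower() == ".tsv" else ","
    with meta_path.open(newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        return {
            row[seq_col].strip()
            for row in reader
            if row[batch_col].strip() == str(batch_num)
        }


def gene_of(path):
    """Assumed naming scheme: <SeqNo>_<gene>.fasta."""
    parts = path.stem.split("_", 1)
    return parts[1] if len(parts) == 2 else "unknown"


def find_batch(input_dir, meta_path, batch_num, output_dir, split_by_gene=False):
    wanted = read_seq_numbers(meta_path, batch_num)

    # Steps 1 and 3: list fasta files, keep those whose Seq No is in the batch,
    # and resolve them to complete paths.
    matches = sorted(
        p.resolve()
        for p in Path(input_dir).rglob("*")
        if p.is_file()
        and p.suffix.lower() in FASTA_SUFFIXES
        and p.stem.split("_", 1)[0] in wanted
    )

    # Step 5: group by gene if requested, otherwise keep one group.
    grouped = {}
    for path in matches:
        key = gene_of(path) if split_by_gene else "all"
        grouped.setdefault(key, []).append(path)

    # Step 6: create the batch folder inside output_dir.
    batch_dir = Path(output_dir) / str(batch_num)
    batch_dir.mkdir(parents=True, exist_ok=True)

    # Steps 4 and 7: concatenate each group and write it out.
    written = []
    for key, paths in grouped.items():
        out_path = batch_dir / f"{key}.fasta"
        with out_path.open("w") as out:
            for path in paths:
                text = path.read_text()
                out.write(text if text.endswith("\n") else text + "\n")
        written.append(out_path)
    return written
```

In this sketch the batch folder is created before anything is written, so the single write in steps 4 and 7 happens once per output file.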

Default paths to search in:

Considerations:

ammaraziz commented 1 year ago

For the rename branch to be merged into main, I need to fix or add the following: