NBChub / bgcflow

Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)
https://github.com/NBChub/bgcflow/wiki
MIT License

Analyse subsequence of genome #302

Closed by FriederikeBiermann 4 weeks ago

FriederikeBiermann commented 11 months ago

Hi, I'd like to ask whether bgcflow allows for the analysis of a specific region within a contig. For instance, I'm interested in examining the sequence NT_187580.1, spanning positions 169942 to 188315. Is this feasible using only the samples.csv file, or is it necessary to download all the corresponding snippets as FASTA files from NCBI and incorporate them into the CSV as custom entries?

I appreciate your assistance!

Best regards, Frida

matinnuhamunada commented 11 months ago

Hi Frida, thanks for the inquiry :)

I think this request is too specific to be incorporated into the current workflow. For now, I would go with your second suggestion: download the corresponding snippets as FASTA files (.fna), then add them to samples.csv using the custom input type.

I will put this on the wish list of features to be added in a future release, something similar to the antiSMASH --start and --end options.
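As a rough illustration of the workaround suggested above, the sketch below cuts a region out of an already downloaded FASTA file and writes it to a new .fna file that could then be listed as a custom entry. This is only a sketch under stated assumptions: the helper names (`read_fasta`, `extract_region`, `write_fasta`, `extract_region_to_fna`) and the file names are hypothetical and not part of bgcflow, the coordinates are assumed to be 1-based and inclusive, and the layout of the custom entry in samples.csv is not shown here and should be checked against the bgcflow wiki.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict mapping record ID -> sequence string."""
    records = {}
    current_id = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the previous record before starting a new one.
                if current_id is not None:
                    records[current_id] = "".join(chunks)
                # The record ID is the first whitespace-delimited token.
                current_id = line[1:].split()[0]
                chunks = []
            elif current_id is not None:
                chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


def extract_region(sequence, start, end):
    """Return sequence[start..end], assuming 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(
            f"Region {start}-{end} is outside the sequence (length {len(sequence)})"
        )
    return sequence[start - 1:end]


def write_fasta(path, record_id, sequence, width=80):
    """Write a single record as FASTA, wrapping lines at `width` characters."""
    with open(path, "w") as handle:
        handle.write(f">{record_id}\n")
        for i in range(0, len(sequence), width):
            handle.write(sequence[i:i + width] + "\n")


def extract_region_to_fna(in_fasta, contig_id, start, end, out_fna):
    """Extract one region of one contig and save it as a new .fna file."""
    records = read_fasta(in_fasta)
    region = extract_region(records[contig_id], start, end)
    # The new record ID encodes the source contig and coordinates (a naming
    # choice made for this sketch, not a bgcflow requirement).
    write_fasta(out_fna, f"{contig_id}_{start}_{end}", region)
    return region


# Hypothetical usage for the region mentioned in this issue (file names are
# placeholders for a locally downloaded FASTA file):
# extract_region_to_fna(
#     "NT_187580.1.fna", "NT_187580.1", 169942, 188315, "NT_187580_region.fna"
# )
```

Keeping the coordinates in the record ID makes it easier to map any detected clusters back to positions on the original contig.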

FriederikeBiermann commented 11 months ago

Thank you :)