ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

Extract fasta sequences #13

Closed gunjanpandey closed 1 year ago

gunjanpandey commented 1 year ago

How can I extract FASTA sequences from this database, please?

etvedte commented 1 year ago

Hello,

Can you please clarify this request?

Are you using FCS-adaptor or FCS-GX? What sequences are you trying to retrieve?

rahulvrane commented 1 year ago

Reiterating the question above: yes, if there were a way to extract the FASTA sequences from this database (FCS-GX), that would be great, since we want to use the same sequences in other applications for consistency.

etvedte commented 1 year ago

Hello,

We are discussing as a team various database delivery methods. Can you briefly describe your specific application?

rahulvrane commented 1 year ago

One example stems from our interest in hologenome databases. For every contaminant identified by FCS-GX, we want to build Bloom filters and Kraken databases to fish out the raw reads that will be used for targeted reassembly.
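A minimal sketch of the read-fishing step described above, assuming the contaminant sequences have already been obtained as plain strings. Everything here is illustrative and not part of FCS-GX: the `BloomFilter` class, `build_filter`, `read_matches`, the k-mer size of 21, and the 50% match threshold are all hypothetical choices, and reverse complements are ignored for brevity.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over strings (illustrative, not an FCS-GX component)."""

    def __init__(self, num_bits=1 << 20, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes independent bit positions by salting BLAKE2b.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every overlapping k-mer of seq, upper-cased."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_filter(contaminant_seqs, k=21):
    """Insert all k-mers of the contaminant sequences into a Bloom filter."""
    bloom = BloomFilter()
    for seq in contaminant_seqs:
        for kmer in kmers(seq, k):
            bloom.add(kmer)
    return bloom


def read_matches(read, bloom, k=21, min_fraction=0.5):
    """Return True if at least min_fraction of the read's k-mers hit the filter."""
    total = 0
    hits = 0
    for kmer in kmers(read, k):
        total += 1
        if kmer in bloom:
            hits += 1
    # Reads shorter than k have no k-mers and never match.
    return total > 0 and hits / total >= min_fraction
```

In use, one filter would be built per contaminant (or per contaminant group), and each raw read would be kept for reassembly when `read_matches` returns `True`. A Bloom filter can report false positives but never false negatives, so the bit-array size and threshold would need tuning for real genome-scale inputs.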

pstrope commented 1 year ago

Hi, are you using the Singularity or the Docker image for FCS-GX?

tillenglert commented 1 year ago

I'm also looking for a way to modify or extract the sequences from the FCS-GX db. Is there a db browser or any other tool I could use to open the db file? Also, would it be possible to create an even smaller test db, or is 5 GB pretty much the smallest database possible for testing?

Thank you in advance!

pstrope commented 1 year ago

Hi,

We are planning to release a new version of the software where you will be able to extract the fasta sequences from the gx db. We've made some significant changes to the software and are testing. Once that's done we will release.

Thank you! Pooja

pstrope commented 1 year ago

We've updated the software and scripts to v0.3.0 and added this section to the docs:

https://github.com/ncbi/fcs/wiki/FCS-GX#useful-gx-subcommands

That section explains how you can extract sequences from the gxdb. Hope this helps!

Pooja

pstrope commented 1 year ago

Closing. Please follow up if you have other questions.