ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

[FEATURE REQUEST]: Option to disable validate_fasta in FCS-adaptor #25

Closed brantfaircloth closed 1 year ago

brantfaircloth commented 1 year ago

Is this a feature request for FCS-adaptor or FCS-GX?

FCS-adaptor

Describe the problem you'd like to be solved

validate_fasta fails when contigs contain a large number of Ns. For example, I have scaffolds generated using optical mapping that contain long runs of N bases within gaps. The size of each gap is estimated from the optical map. However, large gaps can trigger an error when validate_fasta runs, which prevents the remainder of the pipeline from running.

Describe the solution you'd like

An option to disable validate_fasta (with the default being that validate_fasta is enabled).

Describe alternatives you've considered

Removing scaffolds that fail to validate and re-running, and/or splitting scaffolds on the gaps and running the component pieces. Either approach can be difficult and potentially error-prone.
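Either alternative starts with knowing which scaffolds carry long gaps. A minimal sketch of that step follows, assuming a plain (possibly line-wrapped) FASTA file. The function name `longest_n_runs` is hypothetical, and the thread does not state the N threshold that validate_fasta applies, so this only reports run lengths and does not predict which scaffolds would fail.

```shell
#!/bin/bash

# Hypothetical helper: for each FASTA record in the file given as $1,
# print "<sequence name><TAB><length of its longest run of N/n>".
longest_n_runs() {
    awk '
        # Print the result for the record collected so far, if any.
        function emit() { if (name != "") printf "%s\t%d\n", name, best }

        # Header line: report the previous record and start a new one.
        /^>/ { emit(); name = substr($1, 2); run = 0; best = 0; next }

        # Sequence line: the run counter carries across wrapped lines.
        {
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = substr($0, i, 1)
                if (c == "N" || c == "n") { run++; if (run > best) best = run }
                else run = 0
            }
        }

        END { emit() }
    ' "$1"
}
```

Calling `longest_n_runs scaffolds.fasta | sort -k2,2nr | head` (with `scaffolds.fasta` standing in for the real assembly) would list the scaffolds with the largest gaps first. The character-by-character loop is simple rather than fast, so it may be slow on very large assemblies.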

Thanks much for both FCS-adaptor and FCS-GX - they're working well!

pstrope commented 1 year ago

Hi Brant,

Which version of fcs-adaptor are you using? You can find this at the top of the runner script. In v0.3.0 we modified validate_fasta so that it no longer errors out on too many N's.

Good to hear that both programs are working well for you!

thanks, Pooja

brantfaircloth commented 1 year ago

Hi Pooja,

Thanks for the quick response! It appears that I am using v0.3.0...

> head -n 10 run_fcsadaptor.sh

#!/bin/bash

SCRIPT_NAME=$0
DEFAULT_VERSION="0.3.0"
DOCKER_IMAGE=ncbi/fcs-adaptor:${DEFAULT_VERSION}
SINGULARITY_IMAGE=""
CONTAINER_ENGINE="docker"

usage()
{

pstrope commented 1 year ago

Are you using docker or singularity?

brantfaircloth commented 1 year ago

I'm using Singularity:

curl https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/FCS/releases/0.2.3/fcs-adaptor.0.2.3.sif -Lo fcs-adaptor.sif

I see what happened... I just followed the instructions on the wiki and pulled down the 0.2.3 SIF image rather than the 0.3.0 image that I see on the releases site (which I assume contains the corrected workflow).
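A mismatch like this can be caught before running anything. The sketch below compares the `DEFAULT_VERSION` line shown in the runner script above with the version embedded in the SIF file name. The function names are hypothetical, and it assumes the image name follows the `fcs-adaptor.<version>.sif` pattern seen in the curl command.

```shell
#!/bin/bash

# Hypothetical helper: print the value of DEFAULT_VERSION="x.y.z"
# from the runner script given as $1.
runner_version() {
    sed -n 's/^DEFAULT_VERSION="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print the version from a URL or path that ends
# in fcs-adaptor.<version>.sif (assumed naming pattern).
url_version() {
    printf '%s\n' "$1" | sed -n 's/.*fcs-adaptor\.\([0-9][0-9.]*\)\.sif$/\1/p'
}

# Succeed only when the runner script and the image agree on the version.
versions_match() {
    [ "$(runner_version "$1")" = "$(url_version "$2")" ]
}
```

For the files in this thread, `versions_match run_fcsadaptor.sh https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/FCS/releases/0.2.3/fcs-adaptor.0.2.3.sif || echo "version mismatch"` would report the mismatch, since the script declares 0.3.0 and the image name says 0.2.3. Note that this only works on the original URL or file name, not on an image that was renamed to `fcs-adaptor.sif` during download.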

pstrope commented 1 year ago

You are right! I updated the docs just now. And you should get the newest sif. Let us know if that works to solve your problem with N's.

brantfaircloth commented 1 year ago

Thanks - will do!