Closed gunjanpandey closed 1 year ago
Hello,
This is expected behavior. The TRIM calls are >100 bp from the ends of sequences, and therefore are not removed automatically. See https://github.com/ncbi/fcs/wiki/FCS-adaptor#rules-for-action-assignment.
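To make the distance check concrete, here is a minimal Python sketch that computes how far each reported range sits from the nearest sequence end. The function name is hypothetical, and I am assuming the ranges in the report are 1-based and inclusive; the 100 bp threshold is taken from the statement above, not from the tool's source code.

```python
def distance_from_nearest_end(length, start, end):
    """Number of bases between the range and the closer sequence end.

    Assumes 1-based, inclusive coordinates (an assumption, not confirmed
    against the FCS-adaptor report specification).
    """
    left_flank = start - 1        # bases before the range
    right_flank = length - end    # bases after the range
    return min(left_flank, right_flank)


THRESHOLD_BP = 100  # threshold as described in the reply above

# (accession, length, range start, range end) copied from the report in this issue
calls = [
    ("contig_14529", 37558, 37184, 37244),
    ("contig_2502", 251119, 106358, 106405),
]

for name, length, start, end in calls:
    d = distance_from_nearest_end(length, start, end)
    label = "internal" if d > THRESHOLD_BP else "near end"
    print(name, d, label)
```

For the two calls in this issue, the distances come out to 314 bp and 106357 bp, so both ranges are further than 100 bp from either end.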
The rationale for this behavior is the uncertainty about the proper corrective action. Adaptor sequences in the middle of contigs could be the result of false contig joins, in which case the best action would be to split the contig in two at the adaptor site. If it is not a false contig join, then one could simply mask that portion of the sequence. I suggest looking at the regions and the surrounding sequence more closely to determine the best action for your use case.
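As a rough illustration of the two options, the following Python sketch masks or splits a sequence string at a reported range. Both helper names are hypothetical and not part of FCS; I am again assuming 1-based, inclusive coordinates, and the sketch works on a plain string rather than on a FASTA file.

```python
def mask_range(seq, start, end, mask_char="N"):
    """Replace bases start..end (1-based, inclusive) with mask_char."""
    return seq[:start - 1] + mask_char * (end - start + 1) + seq[end:]


def split_at_range(seq, start, end):
    """Drop bases start..end (1-based, inclusive) and return the two flanks."""
    return seq[:start - 1], seq[end:]


# Toy sequence for demonstration only; not real data from this issue.
toy = "ACGTACGTAC"
print(mask_range(toy, 4, 6))
print(split_at_range(toy, 4, 6))
```

Masking keeps the contig length and coordinates unchanged, while splitting produces two shorter sequences that would need new names. Which one is appropriate depends on whether the adaptor marks a false join.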
Closing. Please follow up if you have other questions.
I ran the following command
run_fcsadaptor.sh --fasta-input ${genome} --output-dir adaptor_out/ --image /apps/fcs-genome/0.2.2/dist/fcs-adaptor.sif --container-engine singularity --euk
and got the following result:
#accession	length	action	range	name
contig_14529	37558	ACTION_TRIM	37184..37244	CONTAMINATION_SOURCE_TYPE_ADAPTOR:NGB01054.1:Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT26
contig_2502	251119	ACTION_TRIM	106358..106405	CONTAMINATION_SOURCE_TYPE_ADAPTOR:NGB01054.1:Rubicon Genomics ThruPLEX DNA-seq single-index iPCRtagT26
However, the input ${genome} and the output file adaptor_out/cleaned_sequences/${genome} are identical; no trimming happened. The script finished successfully.