Plant-Food-Research-Open / assemblyqc

A Nextflow pipeline for evaluating assembly quality
https://plant-food-research-open.github.io/assemblyqc/
MIT License

Auto remove the contamination? #19

Open GallVp opened 1 year ago

GallVp commented 1 year ago

It's not that easy. People might make a mistake on the TRIM action: https://github.com/ncbi/fcs/wiki/FCS-adaptor#rules-for-action-assignment

GallVp commented 1 year ago

Add a responsibility statement.

GallVp commented 2 months ago

Please see point 3 here: https://github.com/ncbi/fcs/wiki/FCS-GX-quickstart#usage-examples

By default, the NCBI FCS-GX tool hard masks internal contaminants. Hard masking already has multiple meanings. First, it can represent a gap. Second, it can represent masking by a tool such as RepeatMasker. With this default, it can also mean masking due to contamination.
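To illustrate the ambiguity, here is a minimal Python sketch that lists the runs of `N` in a sequence. The function name `find_n_runs` and the toy sequence are hypothetical and not part of the pipeline or of FCS-GX. The point is that each run comes back as a bare pair of coordinates, with nothing to say whether it is a gap, a masked repeat, or masked contamination.

```python
import re


def find_n_runs(seq):
    """Return 0-based, half-open (start, end) spans of every run of N/n in seq.

    Hypothetical helper for illustration only.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", seq)]


# Toy sequence with two runs of N whose origin cannot be told apart.
toy_seq = "ACGTNNNNACGTNNACGT"
n_runs = find_n_runs(toy_seq)
```

For the toy sequence, `n_runs` holds two spans, and the sequence alone gives no way to attribute either one to a cause.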

Some suggestions:

  1. Adaptors should be removed by the assembler software.
  2. Foreign organism contamination should be removed following the recommendations from NCBI FCS-GX (https://github.com/ncbi/fcs/wiki/FCS-GX-output#fcs-gx-action-report), except for the FIX action for internal contamination. Instead of hard masking the contaminated region, we should replace it with 100 N's to denote a gap of unknown size.
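The second suggestion could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name `replace_with_gaps` is hypothetical, the intervals are assumed to be 1-based and inclusive, and they are assumed to have already been extracted from the FIX rows of the action report. Parsing the report itself is left out.

```python
GAP = "N" * 100  # gap of unknown size, as proposed in suggestion 2


def replace_with_gaps(seq, intervals, gap=GAP):
    """Replace each (start, end) interval of seq with a fixed-size gap.

    Hypothetical helper. Intervals are assumed to be 1-based, inclusive,
    and non-overlapping; this coordinate convention is an assumption and
    should be checked against the actual report.
    """
    pieces = []
    cursor = 0  # 0-based index of the first base not yet copied
    for start, end in sorted(intervals):
        if start < 1 or end > len(seq) or start > end:
            raise ValueError(f"invalid interval: {(start, end)}")
        if start - 1 < cursor:
            raise ValueError(f"overlapping interval: {(start, end)}")
        pieces.append(seq[cursor:start - 1])  # clean sequence before the region
        pieces.append(gap)                    # gap in place of the contaminant
        cursor = end
    pieces.append(seq[cursor:])               # clean sequence after the last region
    return "".join(pieces)


# Toy example: bases 5..8 of a 20 bp contig are treated as contamination.
contig = "ACGT" * 5
fixed = replace_with_gaps(contig, [(5, 8)])
```

Note that this changes the contig length, so any downstream coordinates (annotations, AGP files) would shift and would need to be updated as well.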