Recommendation for an AT-rich and highly repetitive genome

Hello @ruanjue !

We have to assemble a genome with the following features:

Genome size: ~2Gb
High repeat content: ~78.68% (inferred from a closely related species)
High AT content: ~72.36% (inferred from a closely related species)
PacBio HiFi sequencing: 130x coverage

We've looked through some related issues, such as #239, and came up with the following tests. We'd like to ask for your opinion or recommendations on them. Each test has two steps: (a) assembly with wtdbg2; (b) polishing with wtdbg-cns (using minimap2 as the read mapper).

List of possible tests/ideas:

1) Default options:

wtdbg2 -g 2.05g -t 24 -x sq
wtdbg2 -g 2.05g -t 24 -x ccs

2) Add -R parameter:

wtdbg2 -g 2.05g -t 24 -x sq -R

3) Increase -s parameter (0.5 or 0.7 as in #239):

wtdbg2 -g 2.05g -t 24 -x sq -s 0.5

4) Vary -L, increasing it so that only the longest reads are kept.
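
To make the comparison easier to reproduce, here is a minimal dry-run sketch of how we might script tests 1 to 3. It only prints the command lines and does not run anything. The input file name (`reads.fq.gz`), the output prefixes, and the `build_cmd` helper are placeholders of ours, not anything defined by wtdbg2. Test 4 is left out because we have not settled on a -L cutoff yet.

```shell
#!/bin/sh
# Dry-run sketch: print one wtdbg2 command per test configuration.
# READS and the output prefixes are hypothetical placeholders.
READS="reads.fq.gz"
THREADS=24
GSIZE="2.05g"

# build_cmd PREFIX OPTS... : print the full wtdbg2 command line for one test.
# -i (input reads) and -o (output prefix) are added here; the commands in
# the list above omit them.
build_cmd() {
    prefix=$1
    shift
    printf '%s\n' "wtdbg2 -g $GSIZE -t $THREADS -i $READS -o $prefix $*"
}

build_cmd test1_sq  -x sq          # test 1, sq preset
build_cmd test1_ccs -x ccs         # test 1, ccs preset
build_cmd test2_R   -x sq -R       # test 2, add -R
build_cmd test3_s05 -x sq -s 0.5   # test 3, -s 0.5
build_cmd test3_s07 -x sq -s 0.7   # test 3, -s 0.7
```

Printing the commands first lets us review the full matrix before committing cluster time; piping the output to `sh` would then launch the runs one after another.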

Any advice would be appreciated. Thanks in advance!
