Open kristinakordova opened 1 year ago
To build the graph, DBGWAS uses GATB, whereas unitig-caller uses Bifrost, so I would not guarantee that the two graphs are identical. I also don't know whether the default k-mer length is the same in both tools. If you want to compare more thoroughly, I would suggest running Bifrost and BCALM directly on your dataset. I don't think unitig-caller should be doing any additional filtering.
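If it helps with that comparison, here is a rough sketch of one way to check whether two graphs cover the same sequence content. It assumes you can export the unitigs from each tool as FASTA (the file names in the usage note are placeholders, not actual output names of either tool) and that both graphs were built with the same k, which you pass in yourself. It compares canonical k-mer sets rather than node counts, since node counts can differ for reasons other than filtering.

```python
from typing import Dict, Iterable, Iterator, Set, TextIO

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(seq: str) -> str:
    """Return the lexicographically smaller of a sequence and its reverse complement."""
    seq = seq.upper()
    rc = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, rc)


def read_fasta(handle: TextIO) -> Iterator[str]:
    """Yield sequences from a FASTA handle, joining wrapped lines."""
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
                chunks = []
        else:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def canonical_kmers(seqs: Iterable[str], k: int) -> Set[str]:
    """Collect the canonical k-mers of every sequence (strand-independent)."""
    kmers = set()
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            kmers.add(canonical(seq[i:i + k]))
    return kmers


def compare(a: Set[str], b: Set[str]) -> Dict[str, int]:
    """Summarise how two k-mer sets overlap."""
    return {"only_a": len(a - b), "only_b": len(b - a), "shared": len(a & b)}
```

Usage would look something like `compare(canonical_kmers(read_fasta(open("graph_a.fasta")), 31), canonical_kmers(read_fasta(open("graph_b.fasta")), 31))`, where the paths and the value 31 are stand-ins for your own files and k. If the k-mer sets match but the node counts do not, the difference is more likely in how each tool defines or splits nodes than in filtering. Holding every k-mer in a Python set is memory-hungry, so this is only practical on a subset of a large dataset.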
I am running
unitig-caller --call --reads input_reads.txt --out output_folder --threads 76 --pyseer
and
./DBGWAS -strains input_strains.txt -keepNA -output output_folder -nb-cores 76
The two input files contain the same assembled genomes, with NA as the phenotype. I was expecting an identical number of nodes in the graph, but I am getting a mismatch of a few million: 2,251,639 (unitig-caller) and 7,022,727 (DBGWAS). Does unitig-caller apply a filtering threshold? Where does the difference come from?