steineggerlab / conterminator

Detection of incorrectly labeled sequences across kingdoms
GNU General Public License v3.0

Conterminating the output of DIAMOND deepclust #23

Open LotharukpongJS opened 1 year ago

LotharukpongJS commented 1 year ago

Dear @martin-steinegger ,

I really like the approach taken in conterminator and would like to apply it to my dataset, which is the output of diamond deepclust. However, I am unsure whether the output of diamond deepclust can be used as input to conterminator. Could you let me know whether this is possible?

The output of diamond deepclust on the example sequences, produced with diamond deepclust -d example/prots.fas -o dmnd.deepclust.out --approx-id 30, is as follows.

A5GQF2.1        B1X4Q7.1
A5GQF2.1        A5GQF2.1
A5GND2.1        P12409.1
A5GND2.1        A5GND2.1
A5GND2.1        B1X3Y2.1
A5GND2.1        P08445.1
Q3AJZ0.2        B1X3H9.1
Q3AJZ0.2        Q7U6W0.1
Q3AJZ0.2        Q3AJZ0.2
P20403.1        B2ZCQ2.1
P20403.1        B2ZCQ1.1
P20403.1        B2ZCQ0.1
P20403.1        P20403.1
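
To make the structure of this output explicit, here is a minimal sketch of how I read it. It assumes the first column is the cluster representative and the second column is a member of that cluster; the function name read_clusters is my own and is not part of diamond or conterminator. It only groups the IDs and says nothing about what conterminator accepts as input.

```python
from collections import defaultdict


def read_clusters(lines):
    """Group IDs from the second column by the ID in the first column.

    Assumption: column 1 is the cluster representative, column 2 a member.
    """
    clusters = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        representative, member = fields
        clusters[representative].append(member)
    return dict(clusters)


# The example output shown above, pasted verbatim.
example = """\
A5GQF2.1        B1X4Q7.1
A5GQF2.1        A5GQF2.1
A5GND2.1        P12409.1
A5GND2.1        A5GND2.1
A5GND2.1        B1X3Y2.1
A5GND2.1        P08445.1
Q3AJZ0.2        B1X3H9.1
Q3AJZ0.2        Q7U6W0.1
Q3AJZ0.2        Q3AJZ0.2
P20403.1        B2ZCQ2.1
P20403.1        B2ZCQ1.1
P20403.1        B2ZCQ0.1
P20403.1        P20403.1
"""

clusters = read_clusters(example.splitlines())
for representative, members in clusters.items():
    # One line per cluster: representative, cluster size, member IDs.
    print(representative, len(members), ",".join(members))
```

Read this way, the 13 lines describe four clusters, and each representative also appears as a member of its own cluster.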

(@RocesV , maybe this question could be interesting for you too.)

Best regards, Sodai