Closed Wanli-HE closed 10 months ago
Hello,
I believe you're using an older version of dRep that has an incompatibility with newer versions of pandas; please upgrade dRep to the newest version and this error will go away.
Best, Matt
Hi Matt!
Thanks, it works.
Another question: how can I get the clustering information for every bin? It seems I only get the representative bin of each cluster. For instance, which cluster do bin1 and bin2 belong to?
Best, Wanli
Hello,
That information is located in the file Cdb.csv in the data_tables output folder.
Best, Matt
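For anyone wanting the bin-to-cluster mapping as a list, here is a minimal sketch that groups genomes by cluster using only the Python standard library. It assumes Cdb.csv has columns named genome and secondary_cluster; check the header of your own file and adjust the column names if they differ. The sample rows are made up for illustration and are not real dRep output.

```python
import csv
import io
from collections import defaultdict


def clusters_from_cdb(handle, genome_col="genome", cluster_col="secondary_cluster"):
    """Group genome names by cluster ID from an open Cdb.csv-style file.

    The column names are assumptions; verify them against your file's header.
    """
    clusters = defaultdict(list)
    for row in csv.DictReader(handle):
        clusters[row[cluster_col]].append(row[genome_col])
    return dict(clusters)


# Hypothetical sample rows standing in for a real Cdb.csv
sample = io.StringIO(
    "genome,secondary_cluster,primary_cluster\n"
    "bin1.fa,1_1,1\n"
    "bin2.fa,1_1,1\n"
    "bin3.fa,2_1,2\n"
)

clusters = clusters_from_cdb(sample)
for cluster_id, genomes in sorted(clusters.items()):
    # One line per cluster: ID, number of members, member names
    print(cluster_id, len(genomes), ", ".join(genomes))
```

To run it on real output, open the file instead of the sample, for example `open("vamb_drep_res/data_tables/Cdb.csv", newline="")` if vamb_drep_res is the work directory, and pass that handle to `clusters_from_cdb`.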
Hi again!
I checked that file, and there is one thing I am not sure about: I have over 3000 bins, but half of them were clustered into a group called root (UID1).
Is that normal? All the parameters I used are the defaults.
Best, Wanli
I’ve never seen that before. Could you let me know the parameters you ran dRep with and show me the top of that file? (Be nice to know what the headers are)
Hi! Here is the command line: dRep dereplicate vamb_drep_res -p 40 -pa 0.9 -sa 0.95 -nc 0.3 -cm larger --S_algorithm fastANI -g vamb_res_fa_file/*
And here are the results: Chdb.csv
The second one uses the default parameters, but it looks the same as the first one. [Uploading Chdb (2).csv…]()
So I am a little confused about why the results look like that: 3400 bins, but clustered into just 74 groups. This is especially odd because when I used GTDB-Tk to annotate all bins, I got 465 species in total.
Ah I see, you're looking at Chdb.csv, which contains taxonomy information. The clustering information is in the file Cdb.csv
Hi! That makes even less sense: there are only 35 representative genomes with the default parameters, but GTDB-Tk reports over 400 species.
Besides, I want to know which cluster each bin belongs to, for all bins. Does dRep really have such a summary file?
Best, Wanli
Here is another run I tried, using ANI > 0.98, but it still has only 34 clusters.
Hi!
I used dRep recently and ran into some issues.
The issue is:
The command is: "dRep dereplicate vamb_drep_res -p 40 -pa 0.9 -sa 0.95 -nc 0.3 -cm larger --S_algorithm fastANI -g vamb_res_fa_file/* " version: latest
The workflow works:
"""
There are the columns: ['genome', 'completeness', 'contamination', 'strain_heterogeneity']
Filtering genomes
1.01% of genomes passed checkM filtering
Storing resulting files

Running primary clustering
Running pair-wise MASH clustering
Clustering MASH database
"""