I ran meshclust on a very small set of four E. coli genomes using the following command:
meshclust -d PATH_TO_FILE/file.fa -o PATH_TO_OUTPUTDIR/mesh_clust_threshold_AUTO.txt -c 16 -a y
The program ran for some time training Identity, then settled on a threshold of -0.00740926 and crashed because the threshold was negative. For now my workaround is to estimate an appropriate threshold by hand and use it for further experiments, but the negative automatic threshold might be worth looking into.
Output generated by meshclust:
Cores: 16
Estimating the threshold ...
Average: 4544141
K: 11
Histogram size: 4194304
A histogram entry is 32 bits.
Generating data.
Number of standard deviations: 2
Preparing data ...
Positive examples: 10000
Training size: 5000
Validation size: 5000
Better performance of: 5.69984e-05
sim_ratio
Better performance of: 4.55067e-06
sim_ratio
correlation^2
Better performance of: 1.10609e-06
sim_ratio
simMM
correlation^2
minkowski x sim_ratio
minkowski x sim_ratio^2
Better performance of: 8.40882e-07
minkowski
sim_ratio
simMM
d2_star
correlation^2
chebyshev x d2_star
minkowski x sim_ratio
minkowski x sim_ratio^2
Better performance of: 6.94492e-07
minkowski
jeffrey_divergence
sim_ratio
simMM
d2_star
correlation^2
chebyshev x jeffrey_divergence
chebyshev x d2_star
minkowski x sim_ratio
minkowski x sim_ratio^2
chebyshev^2 x minkowski^2
Selected statistics:
minkowski
jeffrey_divergence
sim_ratio
simMM
d2_star
correlation^2
chebyshev x jeffrey_divergence
chebyshev x d2_star
minkowski x sim_ratio
minkowski x sim_ratio^2
chebyshev^2 x minkowski^2
Finished training.
MAE: 0.000665851
MSE: 6.94492e-07
Optimizing ...
Validating ...
MAE: 0.000666787
MSE: 6.86473e-07
Mean = 0.707903
STD = 0.411681
Min = -0.00740926
============================================
-0.00740926
Final threshold: -0.00740926
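In case it helps anyone hitting the same thing, here is a minimal sketch of how the captured output could be checked before the threshold is reused. It is not part of meshclust. The function names `final_threshold` and `is_usable` are hypothetical, and the check assumes that a meaningful identity threshold lies in (0, 1]. Note that in the log above the final threshold equals the reported Min value.

```python
import re

# Hypothetical helper: pull the number from the "Final threshold:" line
# of a saved meshclust log. Returns None if the line is missing.
def final_threshold(log_text):
    match = re.search(
        r"^Final threshold:\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$",
        log_text,
        re.MULTILINE,
    )
    return float(match.group(1)) if match else None

# Assumption: an identity threshold is only meaningful in (0, 1].
def is_usable(threshold):
    return threshold is not None and 0.0 < threshold <= 1.0

# Last lines of the output reported above.
sample_log = """Mean = 0.707903
STD = 0.411681
Min = -0.00740926
Final threshold: -0.00740926
"""

t = final_threshold(sample_log)
if not is_usable(t):
    print(f"Automatic threshold {t} is not usable; pick one manually.")
```

With the output from this report the check fails, which matches the crash and is the point at which I fall back to a manually chosen threshold.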