BBaloglu / ASHURE

Python-based pipeline for analyzing Nanopore sequencing metabarcoding data
GNU Lesser General Public License v3.0

clst_folder (-f) type in clst #6

Open adriantich opened 3 months ago

adriantich commented 3 months ago

When using ashure with the clst module, the type of the -f parameter (clst_folder) is set to int rather than str, as it should be (line 936 of src/ashure.py).
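To illustrate the effect, here is a minimal sketch of how argparse behaves when a folder option is declared with type=int versus type=str. The parser below is a hypothetical stand-in, not the actual definition in src/ashure.py; the names build_parser and clst_sketch are assumptions made for the example.

```python
import argparse


def build_parser(folder_type):
    # Hypothetical stand-in for the clst subparser; the real one is in src/ashure.py.
    parser = argparse.ArgumentParser(prog="clst_sketch")
    parser.add_argument("-f", dest="clst_folder", type=folder_type)
    return parser


# With type=str, a folder path is accepted unchanged.
args = build_parser(str).parse_args(["-f", "./my_clusters/"])
print(args.clst_folder)

# With type=int, argparse cannot convert the path and exits with a usage error.
try:
    build_parser(int).parse_args(["-f", "./my_clusters/"])
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
print("argparse exit code with type=int:", exit_code)
```

If the option is declared this way in ashure.py, changing the type to str for -f would presumably be enough to let a folder path through.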

adriantich commented 1 week ago

I found the same problem with the -ts parameter. It happens when the parameters are entered manually on the command line instead of being read from config.json. I used "-ts 4" and got the following error:

pid[2611] 2024-10-24 13:15:09.706 INFO: Running kmeans with n_clusters = 4.0
Traceback (most recent call last):
  File "/app/lib/ASHURE/src/ashure.py", line 1169, in <module>
    main()
  File "/app/lib/ASHURE/src/ashure.py", line 1157, in main
    data, flist = perform_cluster(df, df_d, max_iter=config['clst_N_iter'], csize=config['clst_csize'], N=config['clst_N'], th_s=config['clst_th_s'], th_m=config['clst_th_m'], pw_config=config['clst_pw_config'], msa_config=config['msa_config'], workspace=config['clst_folder'], track_file=config['clst_iter_out'], timestamp=True)
  File "/app/lib/ASHURE/src/ashure.py", line 567, in perform_cluster
    df_c = cluster_sweep(df_q, df_c, th_s, N, csize, pw_config, msa_config, workspace)
  File "/app/lib/ASHURE/src/ashure.py", line 602, in cluster_sweep
    x = bpy.cluster_Kmeans(df_align[['id','m1']], n_clusters=th_s, n_init=10, n_iter=100, ordered=True)
  File "/app/lib/ASHURE/src/bilge_pype.py", line 1566, in cluster_Kmeans
    clust.fit(x)
  File "/root/miniforge3/envs/ashure/lib/python3.9/site-packages/sklearn/base.py", line 1466, in wrapper
    estimator._validate_params()
  File "/root/miniforge3/envs/ashure/lib/python3.9/site-packages/sklearn/base.py", line 666, in _validate_params
    validate_parameter_constraints(
  File "/root/miniforge3/envs/ashure/lib/python3.9/site-packages/sklearn/utils/_param_validation.py", line 95, in validate_parameter_constraints
    raise InvalidParameterError(
sklearn.utils._param_validation.InvalidParameterError: The 'n_clusters' parameter of MiniBatchKMeans must be an int in the range [1, inf). Got 4.0 instead.

In this case the value should be read as an int, not a float.
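One possible workaround, sketched under the assumption that the value reaches cluster_Kmeans as a float such as 4.0, is to coerce it to an int before it is passed as n_clusters. The helper name to_n_clusters is hypothetical and not part of ASHURE or bilge_pype.

```python
def to_n_clusters(value):
    """Convert a CLI or config value to a positive int suitable for n_clusters.

    Accepts ints, whole-number floats (4.0) and numeric strings ("4").
    Rejects fractional values and values below 1 instead of silently truncating.
    """
    number = float(value)
    if not number.is_integer() or number < 1:
        raise ValueError(f"n_clusters must be a whole number >= 1, got {value!r}")
    return int(number)


# The value from the traceback above (4.0) becomes a plain int.
n_clusters = to_n_clusters(4.0)
print(type(n_clusters).__name__, n_clusters)

# A fractional value is rejected rather than rounded.
try:
    to_n_clusters(4.5)
    rejected = False
except ValueError:
    rejected = True
print("4.5 rejected:", rejected)
```

Declaring the option with type=int in the argument parser would address the command-line case as well; a coercion like the one above would additionally cover a float coming from config.json.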