borenstein-lab / MUSiCC

MUSiCC: A marker genes based framework for metagenomic normalization and accurate profiling of gene abundances in the microbiome
BSD 3-Clause "New" or "Revised" License

Unable to run cross validation step for learning lasso model #6

Open GeraldineKong opened 5 years ago

GeraldineKong commented 5 years ago

Hi there, I was trying to run MUSiCC on my data and ran into a strange problem. Below is the command line I used:

$ run_musicc.py ko_tp12_musicc_TSS.tsv -o ko_tp12_musicc_norm.tsv -n -c learn_model -perf -v

Even though it first reported detecting 18 samples, the run failed at the end with an error saying that the number of samples is 0.

Running MUSiCC...
Input: ko_tp12_musicc_TSS.tsv
Output: ko_tp12_musicc_norm.tsv
Normalize: True
Correct: learn_model
Compute scores: True
Loading data using pandas module...
18 samples and 208 genes
Done.
Performing MUSiCC Correction...
Learning sample-specific models .
Traceback (most recent call last):
  File "/home/geraldine/miniconda3/bin/run_musicc.py", line 26, in <module>
    correct_and_normalize(vars(given_args))
  File "/home/geraldine/miniconda3/lib/python3.6/site-packages/musicc/core.py", line 344, in correct_and_normalize
    final_model, all_samples_mean_scores[s] = learn_lasso_model(final_covariates, final_response)
  File "/home/geraldine/miniconda3/lib/python3.6/site-packages/musicc/core.py", line 35, in learn_lasso_model
    k_fold = cross_validation.KFold(len(res_train), n_folds=num_cv, shuffle=True)
  File "/home/geraldine/miniconda3/lib/python3.6/site-packages/sklearn/cross_validation.py", line 337, in __init__
    super(KFold, self).__init__(n, n_folds, shuffle, random_state)
  File "/home/geraldine/miniconda3/lib/python3.6/site-packages/sklearn/cross_validation.py", line 262, in __init__
    " than the number of samples: {1}.").format(n_folds, n))
ValueError: Cannot have number of folds n_folds=5 greater than the number of samples: 0.

Thanks for your time!
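
Reading the traceback above, the "number of samples" in the error is the value passed to KFold, that is len(res_train) inside learn_lasso_model, and not necessarily the 18 samples reported when the table was loaded. The following is a minimal pure-Python sketch of that check and of one possible guard. The function names check_fold_count and safe_fold_count are hypothetical and are not part of MUSiCC or scikit-learn; why res_train ends up empty for this input is not shown in the traceback and is not addressed here.

```python
def check_fold_count(n_train, n_folds=5):
    """Mimic the validation that raised the ValueError in the traceback.

    n_train stands in for len(res_train); n_folds for num_cv.
    (Hypothetical helper, written only to illustrate the failure.)
    """
    if n_folds > n_train:
        raise ValueError(
            "Cannot have number of folds n_folds={0} greater than the "
            "number of samples: {1}.".format(n_folds, n_train)
        )
    return n_folds


def safe_fold_count(n_train, requested_folds=5, min_folds=2):
    """Hypothetical guard: pick a usable fold count before cross-validation.

    Returns None when there are too few training rows to cross-validate
    at all, otherwise caps the fold count at the number of rows.
    """
    if n_train < min_folds:
        return None
    return min(requested_folds, n_train)


if __name__ == "__main__":
    for n in (0, 3, 18):
        folds = safe_fold_count(n)
        if folds is None:
            print("n_train=%d: too few training rows for cross-validation" % n)
        else:
            print("n_train=%d: using %d folds" % (n, folds))
```

A guard like this would only turn the crash into a clearer message; an empty training set still means no model can be learned for that sample, so the cause of the empty res_train would need to be investigated separately.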