bioinfo-ut / PhenotypeSeeker

Identify phenotype-specific k-mers and predict phenotype using sequenced bacterial strains
GNU General Public License v3.0

phenotype seeker prediction caused Value error by sklearn #1

Closed aresDeathscythe closed 6 years ago

aresDeathscythe commented 6 years ago

Hi,

I created my own model for phenotype prediction with "phenotypeseeker modeling" (all default parameters). When I now run the prediction on the output of the modeling step (I did everything as described on the main page), I get the following error:

  File "/usr/local/bin/phenotypeseeker", line 213, in <module>
    Main()
  File "/usr/local/bin/phenotypeseeker", line 207, in Main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/PhenotypeSeeker/prediction.py", line 147, in prediction
    predict(samples_order, phenotypes_to_predict)
  File "/usr/local/lib/python2.7/dist-packages/PhenotypeSeeker/prediction.py", line 117, in predict
    predictions = model.predict(kmers_presence_matrix)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/metaestimators.py", line 115, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_search.py", line 468, in predict
    return self.best_estimator_.predict(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/base.py", line 324, in predict
    scores = self.decision_function(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/base.py", line 300, in decision_function
    X = check_array(X, accept_sparse='csr')
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence.

Thanks for helping.
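For readers who hit the same message: NumPy raises "setting an array element with a sequence" when it is asked to build a numeric array from nested lists that are not rectangular, for example when one row is shorter than the others. Whether that is exactly what happened here is an assumption, not something the thread confirms. The helper below is a hypothetical diagnostic (the name `find_ragged_rows` is not part of PhenotypeSeeker) that checks a list-of-lists matrix for rows of unequal length before it is handed to `model.predict`.

```python
def find_ragged_rows(matrix):
    """Return (expected_length, offenders) for a list-of-lists matrix.

    expected_length is the length of the first row; offenders is a list of
    (row_index, row_length) pairs for every row whose length differs.
    Hypothetical helper for illustration, not PhenotypeSeeker code.
    """
    if not matrix:
        return 0, []
    expected = len(matrix[0])
    offenders = [
        (index, len(row))
        for index, row in enumerate(matrix)
        if len(row) != expected
    ]
    return expected, offenders


# Made-up presence matrix: the second sample is missing one k-mer column.
rows = [[1, 0, 1], [0, 1], [1, 1, 0]]
expected, offenders = find_ragged_rows(rows)
for index, length in offenders:
    print("row %d has %d columns, expected %d" % (index, length, expected))
```

If the list of offenders is non-empty, the input files for those samples are the first place to look.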

erkiaun commented 6 years ago

"PhenotypeSeeker prediction" was written against a version of gmer_caller that writes a header line in its output file. However, the gmer_caller version shipped in the bin folder produces output without a header.

I fixed "PhenotypeSeeker prediction" so that it works with either version of gmer_caller (the program that "PhenotypeSeeker prediction" uses to quickly check which k-mers are present in the samples).
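A minimal sketch of the general idea of accepting output with or without a header follows. It is not the actual fix. It assumes a hypothetical tab-separated layout in which each data line is a k-mer followed by an integer count; the real gmer_caller output format may differ, and the function name `parse_kmer_counts` is invented for illustration.

```python
def parse_kmer_counts(lines):
    """Parse (kmer, count) pairs from tab-separated lines.

    Assumed layout (hypothetical): "<kmer>\t<count>" per line, optionally
    preceded by one header line. A first line whose second field is not an
    integer is treated as the header and skipped.
    """
    records = []
    for index, line in enumerate(lines):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # blank or single-field line: nothing to parse
        kmer, count = fields[0], fields[1].strip()
        if not count.isdigit():
            if index == 0:
                continue  # only the very first line may be a header
            raise ValueError("unexpected non-numeric count on line %d" % (index + 1))
        records.append((kmer, int(count)))
    return records


# Made-up sample data in the assumed layout.
with_header = ["KMER\tCOUNT\n", "ACGT\t1\n", "TTGA\t0\n"]
without_header = with_header[1:]
print(parse_kmer_counts(with_header) == parse_kmer_counts(without_header))
```

Detecting the header from the content of the first line, rather than always skipping it, means a header-less file does not silently lose its first k-mer.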