EducationalTestingService / skll

SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.
http://skll.readthedocs.org

Exception while running a classification #268

Closed ecvgit closed 8 years ago

ecvgit commented 8 years ago

I am running a classification experiment and get the exception below. The input is a file in ARFF format, and the same file works fine in Weka. The class distribution is a bit skewed. Is this a bug?

Traceback (most recent call last):
  File "/usr/local/bin/run_experiment", line 9, in <module>
    load_entry_point('skll==1.1.1', 'console_scripts', 'run_experiment')()
  File "/usr/local/lib/python2.7/dist-packages/skll/utilities/run_experiment.py", line 108, in main
    ablation=ablation, resume=args.resume)
  File "/usr/local/lib/python2.7/dist-packages/skll/experiments.py", line 848, in run_configuration
    _classify_featureset(job_args)
  File "/usr/local/lib/python2.7/dist-packages/skll/experiments.py", line 446, in _classify_featureset
    grid_jobs=grid_search_jobs)
  File "/usr/local/lib/python2.7/dist-packages/skll/learner.py", line 1417, in cross_validate
    stratified else KFold(len(examples.labels),
  File "/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 441, in __init__
    test_split = test_split[test_split < len(label_test_folds)]
TypeError: object of type 'numpy.int64' has no len()

aoifecahill commented 8 years ago

Could you share the configuration that you used?

ecvgit commented 8 years ago

[General]
experiment_name = COMP_CV
task = cross_validate

[Input]
train_directory = train
featuresets = [["outall_fixed_wv_with_u_removed.arff"]]
learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
label_col = risk_type
id_col = ID

[Tuning]
grid_search = true
objective = accuracy

[Output]
log = output
results = output
predictions = output

aoifecahill commented 8 years ago

Hmm, I've been able to run a cross-validation experiment with some of the sample Weka files and did not see that exception.

How skewed is your distribution? I'm not sure that this would even cause that exception though.
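
One way to answer the skew question is to count the examples per class and flag any class that has fewer examples than the number of cross-validation folds. The sketch below is only an illustration: `summarize_labels` is a hypothetical helper, not part of SKLL or scikit-learn, the sample labels are made up to stand in for the `risk_type` column, and the fold count of 10 is an assumption. It is not confirmed that such skew is what triggers the exception above.

```python
from collections import Counter


def summarize_labels(labels, num_folds=10):
    """Count examples per class and list classes smaller than the fold count.

    `num_folds` is an assumed value; use whatever the experiment actually runs with.
    """
    counts = Counter(labels)
    too_small = sorted(label for label, n in counts.items() if n < num_folds)
    return counts, too_small


# Hypothetical labels standing in for the `risk_type` column of the ARFF file.
labels = ["low"] * 95 + ["medium"] * 4 + ["high"] * 1

counts, too_small = summarize_labels(labels, num_folds=10)
for label, n in counts.most_common():
    print("{}: {} ({:.1%})".format(label, n, float(n) / len(labels)))
print("classes with fewer examples than folds: {}".format(too_small))
```

If the last line lists any classes, stratified splitting cannot place an example of each of those classes in every fold, which is worth ruling out before looking elsewhere.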

desilinguist commented 8 years ago

@ecvgit are you still experiencing this issue?

desilinguist commented 8 years ago

I am going to close this for now. Please feel free to comment if you are still experiencing this issue and I will reopen.