tmadl / semisup-learn

Semi-supervised learning frameworks for Python, which allow fitting scikit-learn classifiers to partially labeled data
MIT License

Can't get the dataset "Lung cancer (Ontario)". #5

Open Sier-xyt opened 7 years ago

Sier-xyt commented 7 years ago

Running example.py fails with the following error:

urllib2.HTTPError: HTTP Error 404: Dataset 'lung-cancer-ontario' not found on mldata.org.

shubhamjn1 commented 6 years ago

Yes, I also encountered a similar error. You can try another dataset, for example "heart".

sxy77 commented 6 years ago

Since the lung-cancer-ontario dataset cannot be found, I used colon-cancer instead, but then I got the following error. I think it might be caused by the size of the dataset, but I don't know how to fix it. Any suggestions, please?

Traceback (most recent call last):
  File "D:/semisup-learn-master/examples/example.py", line 23, in <module>
    random_labeled_points = random.sample(np.where(ytrue == 0)[0], labeled_N/2)+\
  File "C:\Python27\lib\random.py", line 325, in sample
    raise ValueError("sample larger than population")
ValueError: sample larger than population
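For reference, `random.sample` raises this error whenever it is asked for more items than the population holds, so here one class has fewer than `labeled_N/2` points. That could be because the dataset is small, or because its labels are not encoded as 0 and 1 (an assumption, not checked against colon-cancer). A minimal sketch of a workaround that caps the request at the class size is below. The helper name `sample_labeled_indices` and the `seed` parameter are hypothetical and not part of semisup-learn.

```python
import random

import numpy as np


def sample_labeled_indices(ytrue, labeled_N, seed=None):
    """Pick up to labeled_N // 2 indices from each of the classes 0 and 1.

    Hypothetical helper: it never requests more points than a class
    contains, which avoids "sample larger than population".
    """
    rng = random.Random(seed)
    per_class = labeled_N // 2  # integer division, also valid on Python 3
    chosen = []
    for label in (0, 1):
        # Convert to a plain list so random.sample accepts it on Python 3.
        candidates = np.where(ytrue == label)[0].tolist()
        k = min(per_class, len(candidates))
        chosen.extend(rng.sample(candidates, k))
    return chosen
```

If one class comes back empty, it is worth printing `np.unique(ytrue)` to see how the labels are actually encoded before changing `labeled_N`.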

tongliuTL commented 6 years ago

Same issue here. I would appreciate any help or a workaround to get the data. Thanks!

Follow-up question: the dataset is still available in HDF5 format at http://mldata.org/repository/data/viewslug/lung-cancer-ontario/. Is there any way to convert it to the expected .mat format?
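One possible approach, sketched here under assumptions, is to read the HDF5 file with `h5py` and write its numeric arrays out with `scipy.io.savemat`. The internal layout of the mldata.org file is not known here, so the sketch copies every numeric dataset it finds and derives the variable names from the HDF5 paths. The function name `hdf5_to_mat` is hypothetical, and whether the result matches the layout example.py expects is untested.

```python
import h5py
import numpy as np
from scipy.io import loadmat, savemat


def hdf5_to_mat(hdf5_path, mat_path):
    """Copy all numeric datasets from an HDF5 file into a .mat file.

    Hypothetical helper: variable names are the HDF5 paths with "/"
    replaced by "_". Non-numeric datasets (e.g. strings) are skipped.
    """
    arrays = {}

    def collect(name, obj):
        # visititems calls this for every group and dataset in the file.
        if isinstance(obj, h5py.Dataset) and np.issubdtype(obj.dtype, np.number):
            arrays[name.replace("/", "_")] = obj[()]  # read the full array

    with h5py.File(hdf5_path, "r") as f:
        f.visititems(collect)

    savemat(mat_path, arrays)
    return sorted(arrays)
```

The returned list of variable names shows which arrays were written, which helps when deciding which one holds the features and which one holds the labels.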