iamaziz / PyDataset

Instant access to many datasets in Python.
MIT License
934 stars, 87 forks

Unable to load datasets (Python 3.5.1 under Anaconda, Win 7) #1

Closed. coryandrewtaylor closed this 8 years ago

coryandrewtaylor commented 8 years ago

Hi,

I'm unable to load datasets in Python 3.5.1, Win 7. I can install pydataset, import it, and view available datasets just fine. However, when I try to load datasets, I get an error saying that I have the wrong name for the dataset. For example:

In [1]: iris= data('iris')
Traceback (most recent call last):

  File "<ipython-input-3-f894fb655dca>", line 1, in <module>
    cake = data("cake", show_doc=True)

  File "C:\Users\ctaylor\AppData\Local\Continuum\Anaconda3\lib\site-packages\pydataset\__init__.py", line 36, in data
    raise Exception('Wrong dataset name! Try: data() to see available.')

Exception: Wrong dataset name! Try: data() to see available.
iamaziz commented 8 years ago

Hey, thanks for reporting this. Unfortunately, I don't have access to a Windows machine at the moment :/ But it seems to be a Unicode-related error in the string you passed to data().

Since the file already exists (otherwise it would be an OSError) but the passed string didn't match its actual dataset_id name (although they appear with the same letters), the only thing I could think of is the character properties.

Could you verify the string type before you pass it to data()? Hopefully someone who has tried this on Windows can bring some more insight into this as well.

I tried to reproduce this error by encoding the string as UTF-8 (which produces a bytes object in Python 3), and got the same error:

>>> data('iris'.encode('utf-8'))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    data('iris'.encode('utf-8'))
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pydataset/__init__.py", line 36, in data
    raise Exception('Wrong dataset name! Try: data() to see available.')
Exception: Wrong dataset name! Try: data() to see available.
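
To check the string type as suggested above, something like the following sketch may help. It uses a plain dictionary as a stand-in for a dataset index (the `index` dict and the `describe` helper are hypothetical and not part of pydataset) to show that in Python 3 a bytes value does not match a str key, even when the letters look the same.

```python
# Hypothetical stand-in for a name -> file index; not pydataset's real data.
index = {"iris": "iris.csv"}


def describe(name):
    """Return the type name of `name` and whether it is a key in the index."""
    return type(name).__name__, name in index


print(describe("iris"))                  # a str argument
print(describe("iris".encode("utf-8")))  # a bytes argument, as in the call above
```

If the first element printed for your argument is anything other than 'str', the name is likely being passed in the wrong type.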
rafguns commented 8 years ago

From a quick look, it seems like this is caused by the hard-coded forward slashes in https://github.com/iamaziz/PyDataset/blob/master/pydataset/locate_datasets.py#L39 and the lines that follow. The path separator on Windows is '\\' (a backslash), so these should be replaced by os.path.sep.
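
A minimal sketch of why a hard-coded '/' can break on Windows, assuming the dataset name is derived from a file path. The helper names and sample paths are hypothetical and this is not the actual code in locate_datasets.py. It uses the standard library's ntpath and posixpath modules so that both behaviours can be compared on any platform; in real code, os.path picks the right one automatically.

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # POSIX path rules, importable on any platform


def name_by_slash(path):
    """Fragile: assumes '/' is always the path separator."""
    return path.split("/")[-1].replace(".csv", "")


def name_portable(path, pathmod):
    """Portable: let the path module split off the file name and extension."""
    stem, _ext = pathmod.splitext(pathmod.basename(path))
    return stem


# Hypothetical paths, built with each platform's own join rules.
win_path = ntpath.join("C:\\pkg", "datasets", "iris.csv")
nix_path = posixpath.join("/pkg", "datasets", "iris.csv")

print(name_by_slash(nix_path))            # splitting on '/' works for POSIX paths
print(name_by_slash(win_path))            # no '/' to split on, so the whole path remains
print(name_portable(win_path, ntpath))    # the platform's own rules recover the name
print(name_portable(nix_path, posixpath))
```

With the slash-based helper, the key stored for a Windows path would be the whole path rather than the bare dataset name, which would be consistent with a "wrong dataset name" error for a name that looks correct.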

iamaziz commented 8 years ago

Yep, that was the issue, thanks @rafguns. It should be OK now; please update to 0.1.1 and check it out.

coryandrewtaylor commented 8 years ago

It's working as expected now. Thanks!

On Tue, Feb 2, 2016 at 12:30 PM, Aziz Alto notifications@github.com wrote:

Closed #1 https://github.com/iamaziz/PyDataset/issues/1.


nikolas1301 commented 6 years ago

I'm having the same problem as Cory. I'm using Python 3.6.2 on Windows 10.

Traceback (most recent call last):
  File "C:\Users\nikol\PycharmProjects\MachineLearning3\venv\lib\site-packages\pydataset\__init__.py", line 34, in data
    df = read_csv(item)
  File "C:\Users\nikol\PycharmProjects\MachineLearning3\venv\lib\site-packages\pydataset\datasets_handler.py", line 47, in read_csv
    path = __get_csv_path(item)
  File "C:\Users\nikol\PycharmProjects\MachineLearning3\venv\lib\site-packages\pydataset\datasets_handler.py", line 43, in __get_csv_path
    return items[item]
KeyError: 'Titanic'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/nikol/PycharmProjects/MachineLearning3/AulaPydata.py", line 7, in <module>
    titanic = pydataset.data('Titanic')
  File "C:\Users\nikol\PycharmProjects\MachineLearning3\venv\lib\site-packages\pydataset\__init__.py", line 37, in data
    find_similar(item)
  File "C:\Users\nikol\PycharmProjects\MachineLearning3\venv\lib\site-packages\pydataset\support.py", line 48, in find_similar
    raise Exception(ERROR)
Exception: Not valid dataset name and no similar found! Try: data() to see available.
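
The second traceback comes from a fallback that searches for similar names after the exact lookup fails. A rough sketch of that kind of lookup, using the standard library's difflib, is below. The `lookup` function and the sample `available` list are hypothetical, and this is not pydataset's actual implementation.

```python
import difflib


def lookup(name, available):
    """Return (name, []) on an exact match, otherwise (None, close_matches)."""
    if name in available:
        return name, []
    # Compare case-insensitively, then map matches back to their original spelling.
    lowered = {a.lower(): a for a in available}
    close = difflib.get_close_matches(name.lower(), list(lowered), n=3, cutoff=0.6)
    return None, [lowered[c] for c in close]


available = ["iris", "cake", "mtcars"]  # hypothetical sample of dataset names

print(lookup("iris", available))  # exact match
print(lookup("Iris", available))  # no exact match, but a close one exists
print(lookup("zzz", available))   # nothing similar at all
```

If a name that data() itself lists still ends up in the "nothing similar" case, the keys in the index may have been built incorrectly, which would point at the library rather than at a misspelled name.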

Sandy4321 commented 4 years ago

So is there a solution?