rs-delve / covid19_datasets

Interfacing several COVID-19 related datasets
MIT License

xlrd > 2.0.0 only supports .xls files #25

Open MJHutchinson opened 3 years ago

MJHutchinson commented 3 years ago

The requirement xlrd > 1.0.0 now causes issues with the new release of xlrd, because the latest version only supports .xls files and can no longer read .xlsx files.

Best fix is to use openpyxl (https://openpyxl.readthedocs.io/en/stable/) instead

ywteh commented 3 years ago

I think I may have fixed that problem, and may not need xlrd anymore. -yw

MJHutchinson commented 3 years ago

A clean install of the library threw the following error

Traceback (most recent call last):
  File "/data/ziz/not-backed-up/mhutchin/Rmap-dev/Rmap/dataprocessing/process_uk_cases.py", line 5, in <module>
    uk_data = UKCovid19Data(england_area_type=UKCovid19Data.ENGLAND_LOWER_TIER_AUTHORITY)
  File "/data/ziz/not-backed-up/mhutchin/Rmap-dev/Rmap/dataprocessing/covid19_datasets/covid19_datasets/uk_area_stats.py", line 132, in __init__
    UKCovid19Data.wales_cases_data, UKCovid19Data.wales_tests_data = _load_wales_datasets()
  File "/data/ziz/not-backed-up/mhutchin/Rmap-dev/Rmap/dataprocessing/covid19_datasets/covid19_datasets/uk_area_stats.py", line 60, in _load_wales_datasets
    xlsx = pd.ExcelFile(WALES_PATH)
  File "/data/ziz/mhutchin/miniconda3/envs/Rmap2/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 1102, in __init__
    raise ValueError(
ValueError: Your version of xlrd is 2.0.1. In xlrd >= 2.0, only the xls format is supported. Install openpyxl instead.

so maybe the requirements file needs updating?
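To confirm that an environment is in the state the traceback describes, a quick check of the installed package versions can help. The following is a minimal sketch, not part of covid19_datasets: the helper names `installed_version` and `can_read_xlsx` are hypothetical, and the rule it encodes (openpyxl present, or xlrd older than 2.0) is a rough heuristic based on the error message above rather than pandas' exact engine-selection logic.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def can_read_xlsx(xlrd_version, openpyxl_version):
    """Rough heuristic: .xlsx is readable if openpyxl is installed,
    or if the installed xlrd predates 2.0 (which dropped .xlsx support)."""
    if openpyxl_version is not None:
        return True
    if xlrd_version is None:
        return False
    major = int(xlrd_version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    xlrd_v = installed_version("xlrd")
    openpyxl_v = installed_version("openpyxl")
    print("xlrd:", xlrd_v, "| openpyxl:", openpyxl_v)
    print("likely able to read .xlsx:", can_read_xlsx(xlrd_v, openpyxl_v))
```

With xlrd 2.0.1 and no openpyxl, as in the traceback, `can_read_xlsx` returns False.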

ywteh commented 3 years ago

I will look into this and open a pull request. -yw

upaq commented 3 years ago

should the requirements be updated by removing xlrd >= 1.0.0 and adding openpyxl >= 3.0.0 ?

ywteh commented 3 years ago

I was looking into what I did. That might just be it (installing openpyxl and removing xlrd). It seems I did not change any code.

williamse497 commented 3 years ago

You need to make sure that you are on a recent version of pandas (at least 1.0.1, 1.2.0, ...), then install openpyxl with pip install openpyxl (documentation: https://openpyxl.readthedocs.io/en/stable/).

In your pandas code, change calls such as pandas.read_excel('cat.xlsx') to pandas.read_excel('cat.xlsx', engine='openpyxl').

Or you can install the older version of xlrd: pip install xlrd==1.2.0

This is needed because the latest version of xlrd (2.0.1) only supports files with the .xls extension.
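The engine choice described above can be made explicit in code instead of being left to whichever packages happen to be installed. This is only a sketch under assumptions: `pick_excel_engine` and `read_workbook` are hypothetical helpers, not functions from pandas or covid19_datasets, and the extension-to-engine mapping covers only the formats discussed in this thread.

```python
from pathlib import Path


def pick_excel_engine(path):
    """Choose a pandas Excel engine from the file extension.

    xlrd >= 2.0 reads only legacy .xls files, so .xlsx/.xlsm go to openpyxl.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".xls":
        return "xlrd"
    if suffix in {".xlsx", ".xlsm"}:
        return "openpyxl"
    raise ValueError(f"Unsupported Excel extension: {suffix!r}")


def read_workbook(path):
    """Open a workbook with an explicitly chosen engine.

    pandas is imported lazily so the engine helper works without it installed.
    """
    import pandas as pd

    return pd.ExcelFile(path, engine=pick_excel_engine(path))
```

For example, `pick_excel_engine('cat.xlsx')` returns `'openpyxl'`, so `read_workbook('cat.xlsx')` would not touch xlrd at all, provided openpyxl is installed.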