lisphilar / covid19-sir

CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.
https://lisphilar.github.io/covid19-sir/
Apache License 2.0

[Fix] UnicodeEncodeError when automated downloading #1160

Closed shik-design closed 2 years ago

shik-design commented 2 years ago

I tried to run the following with the just-released Covsirphy 2.26.0:

snr = cs.ODEScenario.auto_build(geo="Japan", model=cs.SIRFModel)

But it raises an error:

runfile('C:/Users/USER/.spyder-py3/temp.py', wdir='C:/Users/USER/.spyder-py3')
Retrieving COVID-19 dataset from https://github.com/lisphilar/covid19-sir/data/
Traceback (most recent call last):
  File ~\.spyder-py3\temp.py:3 in <module>
    snr = cs.ODEScenario.auto_build(geo="Japan", model=cs.SIRFModel)
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\science\ode_scenario.py:154 in auto_build
    engineer.download(
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\engineering\engineer.py:89 in download
    df = downloader.layer(**validator.kwargs(DataDownloader.layer, default=default_dict))
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\downloader.py:110 in layer
    new_df = db.layer(country=country, province=province).convert_dtypes()
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\db.py:65 in layer
    return self._country()
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\db_cs_japan.py:61 in _country
    df = self._provide(url=self.URL_C, suffix="", columns=cols, date="Date", date_format="%Y-%m-%d")
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\db.py:142 in _provide
    df = self._provider.latest(filename=filename, url=url, columns=columns, date=date, date_format=date_format)
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\provider.py:52 in latest
    df = self.read_csv(url, columns, date=date, date_format=date_format)
  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\provider.py:106 in read_csv
    df = datatable.fread(path, header=True).to_pandas()
  File ~\anaconda3\envs\VC\lib\urllib\request.py:275 in urlretrieve
    reporthook(blocknum, bs, size)
  File ~\anaconda3\envs\VC\lib\encodings\cp1252.py:19 in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 11-13: character maps to <undefined>

lisphilar commented 2 years ago

Are you using covsirphy with Python 3.10 and a dev version of datatable? Actually, I could not reproduce this error...
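Questions like the one above are easier to answer with a short environment report. The following is a minimal sketch using only the standard library; the helper names `package_version` and `environment_report` are hypothetical and not part of covsirphy, and the default package list is an assumption about what is relevant here.

```python
import platform
import sys
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a package, or a note if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("covsirphy", "datatable", "pandas")) -> dict:
    """Collect interpreter, platform, and package versions for a bug report."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # Some IDE consoles replace sys.stdout, so the attribute may be absent.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }
    for name in packages:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue shows the Python version, the datatable version, and the console encoding at a glance.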

lisphilar commented 2 years ago

Does the following script raise an error now?

import covsirphy as cs
dl = cs.DataDownloader()
dl.layer()

shik-design commented 2 years ago

dl.layer()
Retrieving COVID-19 dataset from https://github.com/lisphilar/covid19-sir/data/
Traceback (most recent call last):

  Input In [4] in <cell line: 1>
    dl.layer()

  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\downloader.py:110 in layer
    new_df = db.layer(country=country, province=province).convert_dtypes()

  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\db.py:65 in layer
    return self._country()

  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\db_cs_japan.py:61 in _country
    df = self._provide(url=self.URL_C, suffix="", columns=cols, date="Date", date_format="%Y-%m-%d")

  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\db.py:142 in _provide
    df = self._provider.latest(filename=filename, url=url, columns=columns, date=date, date_format=date_format)

  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\provider.py:52 in latest
    df = self.read_csv(url, columns, date=date, date_format=date_format)

  File ~\anaconda3\envs\VC\lib\site-packages\covsirphy\downloading\provider.py:106 in read_csv
    df = datatable.fread(path, header=True).to_pandas()

  File ~\anaconda3\envs\VC\lib\urllib\request.py:275 in urlretrieve
    reporthook(blocknum, bs, size)

  File ~\anaconda3\envs\VC\lib\encodings\cp1252.py:19 in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]

UnicodeEncodeError: 'charmap' codec can't encode characters in position 11-13: character maps to <undefined>

The error persisted.
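The last frames of the traceback run from `reporthook` in `urllib.request.urlretrieve` into `encodings\cp1252.py`, which suggests that some text was being encoded to cp1252 and contained characters that codec does not define. The sketch below reproduces that class of failure in isolation. It is an illustration under assumptions: the actual characters are not shown in the traceback, so `progress_text` is a hypothetical stand-in, and `can_encode` and `write_with` are helper names invented here.

```python
import io

# Hypothetical stand-in for the offending text; the real characters are not
# shown in the traceback. U+2588 (full block) is not defined in cp1252.
progress_text = "Downloading \u2588\u2588\u2588 50%"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_with(text: str, encoding: str, errors: str = "strict") -> bytes:
    """Write text through a stream with the given encoding, like a console would."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding, errors=errors)
    stream.write(text)  # raises UnicodeEncodeError in strict mode if unencodable
    stream.flush()
    return buffer.getvalue()


if __name__ == "__main__":
    print(can_encode(progress_text, "cp1252"))
    print(can_encode(progress_text, "utf-8"))
    # errors="replace" substitutes "?" for unencodable characters instead of raising
    print(write_with(progress_text, "cp1252", errors="replace"))
```

If the console encoding is the cause, a commonly used workaround is to make the output stream UTF-8, for example by setting the `PYTHONIOENCODING=utf-8` environment variable before starting Python. Whether that takes effect inside Spyder's console is an assumption and was not verified in this thread.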

lisphilar commented 2 years ago

Because the error says "'charmap' codec can't encode characters", I tried to fix the problem with encoding="utf-8" via #1193. Could you uninstall the current version and install the covsirphy development version 2.26.2-alpha?

pip install --upgrade "git+https://github.com/lisphilar/covid19-sir.git#egg=covsirphy"

lisphilar commented 2 years ago

The version number was incorrect; the correct version is 2.26.3-alpha as of https://github.com/lisphilar/covid19-sir/commit/557b622f9d6a1636e0c33736979ab1a893df8fe4.

lisphilar commented 2 years ago

Bumped the version to 2.27.0-alpha, skipping 2.26.3, so that the release includes dropping Python 3.7 support with #1146.

lisphilar commented 2 years ago

I will close this issue because the change will be included in the 2.27.0 release today, but we can continue this discussion if necessary.