mx-psi / fa-scraper

A FilmAffinity web scraper compatible with Letterboxd
GNU General Public License v3.0

Bug parsing uncommon character #119

Open erikskauch opened 5 months ago

erikskauch commented 5 months ago

Hi,

While trying to scrape the data for this film (https://www.filmaffinity.com/es/film152926.html), I got this exception:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\xxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\Scripts\fa-scrapper.exe\__main__.py", line 7, in <module>
  File "C:\Users\xxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\fa_scrapper\cli.py", line 78, in main
    save_to_csv(data, fieldnames, export_file)
  File "C:\Users\xxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\fa_scrapper\fa_scrapper.py", line 218, in save_to_csv
    writer.writerow(d)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.496.0_x64__qbz5n2kfra8p0\Lib\csv.py", line 164, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.496.0_x64__qbz5n2kfra8p0\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u014d' in position 40: character maps to <undefined>

I guess it is caused by the ō in the director's name, Shin'ichirō Watanabe: when I deleted my vote for this film on FilmAffinity and ran the scraper again, the film was no longer there to scrape and the export worked correctly.
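A minimal sketch that is consistent with that guess: the character named in the traceback, '\u014d', has no mapping in the cp1252 code page, while UTF-8 encodes it without trouble. This only illustrates the encoding behaviour; it does not use any fa-scraper code.

```python
# The character from the traceback: U+014D, LATIN SMALL LETTER O WITH MACRON.
name = "Shin'ichir\u014d Watanabe"

# cp1252 has no mapping for U+014D, so encoding fails as in the traceback.
try:
    name.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as exc:
    cp1252_ok = False
    # Slice out the exact character the codec rejected.
    failing_char = exc.object[exc.start:exc.end]

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_bytes = name.encode("utf-8")
```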

Anyway, thank you very much for your amazing work.

mx-psi commented 5 months ago

:thinking: Interesting. I do test with an account that has this very character (https://github.com/mx-psi/fa-scraper/blob/1499c164ae1a29ee32cf4bf072f0758d2ec5689f/testdata/expected-en.csv#L2), and it works fine in the CI tests, so I am not sure why it's failing in your environment.
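One possible explanation, which is an assumption and not confirmed in this thread: the traceback passes through `encodings\cp1252.py`, which suggests the CSV file was opened with the Windows locale's default encoding, whereas CI machines often default to UTF-8. The sketch below writes a row containing the character with an explicit `encoding="utf-8"`. `write_rows` and the column names are hypothetical and are not fa-scraper's actual `save_to_csv`.

```python
import csv
import locale
import os
import tempfile


def write_rows(path, fieldnames, rows):
    """Hypothetical helper; not fa-scraper's actual save_to_csv."""
    # An explicit encoding avoids depending on the locale default.
    # newline="" is what the csv module documentation recommends.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)


# The encoding open() falls back to when none is given; it varies by machine.
default_encoding = locale.getpreferredencoding(False)

# Placeholder row with illustrative column names.
rows = [{"Title": "Example title", "Directors": "Shin'ichir\u014d Watanabe"}]

path = os.path.join(tempfile.mkdtemp(), "example.csv")
write_rows(path, ["Title", "Directors"], rows)

# Read the file back with the same encoding to confirm the round trip.
with open(path, encoding="utf-8", newline="") as f:
    read_back = list(csv.DictReader(f))
```

On the user side, running Python in UTF-8 mode (for example by setting the `PYTHONUTF8=1` environment variable) changes the default encoding used by `open()` to UTF-8 and may work around the error without code changes.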

mx-psi commented 1 month ago

@erikskauch Would you mind telling me which fa-scraper version you were using? (The --version flag tells you that.)