rseng / pdf-generator

Generate a pdf rendering with a GitHub action, optionally with a web interface.

decoding issues because of missing locales #6

Open Chilipp opened 4 years ago

Chilipp commented 4 years ago

As mentioned in https://github.com/SORSE/sorse.github.io/pull/320, the pdf-generator apparently uses Python 2.7 for the ob-paper command. You can reproduce the error with the following steps:

@vsoch: the issue is in the ob-paper command. Try the following

git clone https://github.com/jcohen02/sorse.github.io.git -b event/2020-09-29
cd sorse.github.io
git checkout d545f979fc85e47f32ed6a8d52750c9b9ff8fd38
docker run -it --entrypoint ob-paper -v `pwd`:/github/workspace rseng/pdf-generator get _events/talks/event-ID_UNKNOWN.md title

When I run these commands, I get the error below, caused by the characters that I replaced in https://github.com/SORSE/sorse.github.io/pull/320/commits/53a6dd75763a5ed9de9a785eaa266be2dc837773

Error message

```
Traceback (most recent call last):
  File "/usr/local/bin/ob-paper", line 11, in <module>
    load_entry_point('openbases==0.0.55', 'console_scripts', 'ob-paper')()
  File "/usr/local/lib/python3.6/dist-packages/openbases/cli/papers/__init__.py", line 106, in main
    main(args=args, options=options, parser=parser)
  File "/usr/local/lib/python3.6/dist-packages/openbases/cli/papers/get.py", line 26, in main
    paper = cli.paper(command[0], quiet=quiet)
  File "/usr/local/lib/python3.6/dist-packages/openbases/main/papers/__init__.py", line 31, in __init__
    self.metadata = read_frontmatter(filename, quiet=quiet)
  File "/usr/local/lib/python3.6/dist-packages/openbases/utils/fileio.py", line 198, in read_frontmatter
    stream = read_file(filename, mode, readlines=False)
  File "/usr/local/lib/python3.6/dist-packages/openbases/utils/fileio.py", line 148, in read_file
    content = filey.read()
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 40: ordinal not in range(128)
```

Running

docker run -it --entrypoint python -v `pwd`:/github/workspace rseng/pdf-generator --version

gives Python 2.7.17.
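
The byte 0xe2 in the traceback is consistent with a UTF-8 encoded typographic character, since dashes and curly quotes all start with that byte in UTF-8. Here is a minimal sketch of how decoding such bytes as ASCII fails. The string is illustrative only and is not the actual content of the event file:

```python
# Illustrative text only; the real file content is not reproduced here.
text = "Research \u2013 Software"  # contains an en dash (U+2013)

raw = text.encode("utf-8")

# In UTF-8 the en dash is a three-byte sequence whose first byte is 0xe2.
dash_bytes = "\u2013".encode("utf-8")
print(dash_bytes.hex())

try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    # Same error class as in the traceback above.
    error = exc
    print(exc)

# Decoding with the matching codec round-trips cleanly.
decoded = raw.decode("utf-8")
```

The position reported in the error is the byte offset of the first non-ASCII byte, which helps locate the offending character in the file.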

vsoch commented 4 years ago

@Chilipp in your error message I only see python3. The container entrypoint may not be python3, but the error trace is using it (3.6).

vsoch commented 4 years ago

The openbases module is installed with pip3, see https://github.com/rseng/pdf-generator/blob/master/Dockerfile#L29.
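
Console scripts installed by pip start with a shebang line naming the interpreter they were installed for, so ob-paper can run under Python 3 even when `python` resolves to 2.7. Below is a minimal sketch of reading that line. `script_interpreter` is a hypothetical helper, and the temporary demo file stands in for a script such as /usr/local/bin/ob-paper:

```python
import os
import tempfile


def script_interpreter(path):
    """Return the shebang target of a script, or None if it has no shebang."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if first_line.startswith(b"#!"):
        return first_line[2:].strip().decode("utf-8", errors="replace")
    return None


# Demonstration on a throwaway file standing in for a console script.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo-script")
    with open(demo, "w", encoding="utf-8") as handle:
        handle.write("#!/usr/bin/python3\nprint('hi')\n")
    interpreter = script_interpreter(demo)
    print(interpreter)
```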

Chilipp commented 4 years ago

> @Chilipp in your error message I only see python3

True, sorry :sweat_smile:

It's strange, because this seems to be an issue with the docker container.

When I run

docker run -it --entrypoint python3 -v `pwd`:/github/workspace rseng/pdf-generator -c "open('_events/talks/event-ID_UNKNOWN.md').read()"

I am getting the above-mentioned error. Can you reproduce this @vsoch?

When I use my local Python 3.6 installation instead (i.e. outside of the Docker container), python3 -c "open('_events/talks/event-ID_UNKNOWN.md').read()" runs without any problems. I don't know what the issue is here. One possibility (although it would mean adapting the read_file function) would be to specify an encoding. In other words, this seems to work:

docker run -it --entrypoint python3 -v `pwd`:/github/workspace rseng/pdf-generator -c "open('_events/talks/event-ID_UNKNOWN.md', encoding='utf-8').read()"
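
When no encoding is passed, `open()` in text mode falls back to the locale's preferred encoding, which is why the same read can succeed on one machine and fail in a container. This minimal sketch shows the difference. The file content is a made-up stand-in, and ASCII is forced explicitly to imitate the container's behaviour:

```python
import locale
import os
import tempfile

# Hypothetical stand-in for the Markdown file; the title is made up.
content = "title: \u201cHello\u201d\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.md")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)

    # The encoding open() uses when none is given explicitly.
    print("default text encoding:", locale.getpreferredencoding(False))

    # Imitate a container without a UTF-8 locale by forcing ASCII.
    try:
        with open(path, encoding="ascii") as handle:
            handle.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # An explicit encoding makes the read independent of the locale.
    with open(path, encoding="utf-8") as handle:
        roundtrip = handle.read()
```

Passing the encoding explicitly keeps the code robust against environment differences, at the cost of changing the library function that opens the file.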

Chilipp commented 4 years ago

Found a fix! We just need to install the locales in the Docker container, i.e. add something like

RUN apt install locales && locale-gen en_US en_US.UTF-8 && dpkg-reconfigure locales

to the Dockerfile. Then it works.
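
One way to check whether a locale change took effect is to print the settings Python consults for its default text encoding, for example by running a snippet like the one below inside the container with python3 as the entrypoint. This is only a diagnostic sketch, and `describe_text_encoding` is a hypothetical helper name:

```python
import locale
import os
import sys


def describe_text_encoding():
    """Collect the settings that influence how open() decodes text by default."""
    return {
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Unset environment variables are reported as None.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
    }


report = describe_text_encoding()
for key, value in report.items():
    print(f"{key}: {value}")
```

If the preferred encoding is reported as an ASCII variant (on glibc systems this typically shows up as ANSI_X3.4-1968), the locale is probably still missing or not selected.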

vsoch commented 4 years ago

Ah let's give that a shot!

vsoch commented 4 years ago

Okay, here is a branch to try: https://github.com/rseng/pdf-generator/pull/7. If that works, I'll merge and draft a release.