Closed. francescocaponio closed this issue 1 year ago
Hi there,
So I looked up what the standard is, and the pip docs state the following:
Requirements files are utf-8 encoding by default and also support PEP 263 style comments to change the encoding (i.e. # -*- coding: <encoding name> -*-).
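To make the quoted rule concrete, here is a minimal sketch of how a PEP 263 style declaration could be read from the first two lines of a requirements file. The function name `declared_encoding` is hypothetical, and the regular expression is the one given in PEP 263; this is not claimed to be how pip or this project implements it.

```python
import codecs
import re

# Pattern taken from PEP 263; it matches "# -*- coding: <encoding name> -*-"
# and similar comment forms.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in a coding comment, or the default.

    Hypothetical helper: only the first two lines are inspected,
    as PEP 263 specifies for Python source files.
    """
    for line in data.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            name = match.group(1).decode("ascii")
            codecs.lookup(name)  # raises LookupError for an unknown codec
            return name
    return default
```

A caller would then decode the file content with `data.decode(declared_encoding(data))`.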
I feel we should expand to support the full PEP if we want to be complete. There are libraries to help with this; I would suggest charset-normalizer.
A fix has been merged, and I will try to release it in the next few days.
I would like to see if we can remove the added encodeing.py file and use a library we can keep up to date.
If not, just state why here, but charset-normalizer feels to me like a maintained version of this. pip tries not to use dependencies (or vendors them), as it is the tool that installs the dependencies in most cases.
I understand. I was also unsure how to proceed; submitting PRs feels like stepping into other people's home. That's why I was asking for your confirmation over there.
I wanted to find a solution that makes it more reliable, but I'm not sure if this is "future safe".
I have this exception
Opening the file in several text editors showed normal text content, until I opened it with a hex editor:
The file was created on an Ubuntu machine with the pip freeze command, the same one that usually generates other requirements files in UTF-8. I don't really know why it was generated in UTF-16 encoding that time.
Edit: I remembered it as Ubuntu, but after some searching, this error happens with pip freeze > requirements.txt in PowerShell, so this was a commit made on Windows.
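The failure can be reproduced without Windows. The sketch below assumes the redirect wrote UTF-16 little-endian with a byte order mark, which is what Windows PowerShell 5.1 does by default; the package line is made up for illustration.

```python
import codecs
from typing import Optional

# Simulated content of a requirements file written by a PowerShell redirect
# (assumption: UTF-16 LE with a BOM; the pinned package is illustrative).
text = "requests==2.31.0\r\n"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def try_utf8(data: bytes) -> Optional[str]:
    """Decode as UTF-8, returning None instead of raising on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


print(raw[:2])               # the BOM bytes a hex editor shows first
print(try_utf8(raw))         # UTF-8 decoding fails on the 0xFF BOM byte
print(raw.decode("utf-16"))  # the "utf-16" codec consumes the BOM
```

The file looks normal in most editors because they honor the BOM silently, while a strict UTF-8 read fails on the very first byte.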
Would it be possible to at least not stop the processing, either by skipping this file or by trying a different encoding?
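One way both options could be combined is sketched below: check for a byte order mark first, fall back to UTF-8, and skip the file with a warning if neither works. The names `decode_requirements` and `read_all` are hypothetical and not part of this project; this is a sketch of the idea, not the merged fix.

```python
import codecs
import logging
from pathlib import Path
from typing import Dict, Iterable, Optional

logger = logging.getLogger(__name__)

# UTF-32 entries come before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_requirements(data: bytes) -> Optional[str]:
    """Return the decoded text, or None if no attempted encoding fits."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            try:
                return data[len(bom):].decode(encoding)
            except UnicodeDecodeError:
                return None
    try:
        return data.decode("utf-8")  # the documented default
    except UnicodeDecodeError:
        return None


def read_all(paths: Iterable[str]) -> Dict[str, str]:
    """Decode every readable file; log and skip the ones that fail."""
    texts = {}
    for path in paths:
        text = decode_requirements(Path(path).read_bytes())
        if text is None:
            logger.warning("Skipping %s: could not determine encoding", path)
            continue
        texts[str(path)] = text
    return texts
```

Skipping with a warning keeps an unattended sync running, at the cost of silently missing the packages listed in the skipped file, so the log message matters.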
My fear is that, since it runs in the background, unattended, a file like this could stop the package sync process, creating a mess for the CI/CD pipelines relying on this local package cache.
Edit: I will make a PR to try to fix this, or at least skip the problematic file. I opened an issue mostly to understand why nobody had run into this file encoding before, and then I discovered it is common when using the PowerShell redirect operator.