Closed pohlt closed 4 years ago
I just saw that you actually let ptable do the table output. How about switching to ptable2 which is still actively developed and should have a compatible API. They might already have fixed this issue.
@pohlt Thanks for the report. Probably you are talking about " 🐓 " emoji.
In my Linux environment I got the following output:
$ pip-licenses -d --format=rest
+------+---------+---------+-----------------------------------------+
| Name | Version | License | Description |
+------+---------+---------+-----------------------------------------+
| poyo | 0.5.0 | MIT | A lightweight YAML Parser for Python. ? |
+------+---------+---------+-----------------------------------------+
How is it broken in your environment?
How about switching to ptable2 which is still actively developed and should have a compatible API. They might already have fixed this issue.
If switching solves this issue, it's an attractive proposition.
However, there are many forked implementations of ptable on PyPI. Making a better choice among them is a difficult decision for me 😔
Under Windows 10 (Python 3.7) I get
+------+---------+---------+-----------------------------------------+
| Name | Version | License | Description |
+------+---------+---------+-----------------------------------------+
| poyo | 0.5.0 | MIT | A lightweight YAML Parser for Python. 🐓 | <= under Win10, I see a chicken here and the table edge is shifted
+------+---------+---------+-----------------------------------------+
which is one character too much.
I agree: It's difficult to decide which prettytable fork is currently best maintained. PTable hasn't seen a commit for two years now and I'm worried it's basically dead.
I'll check if ptable really causes the problem and if ptable2 would fix it.
There is an open issue https://github.com/kxxoling/PTable/issues/20 which describes the same problem. No reaction for 2.5 years.
@pohlt Thanks for the working example on Windows 10.
It was reproduced in my macOS environment. Also, I tried installing source code for ptable2 from the latest master branch. This issue has not been resolved.
If this issue can be solved by using ptable2 in the future, it is better to implement a compatibility layer like P-R in the past.
See also) #52
And when I'm sure ptable2 is stable and well maintained, I plan on migrating from PTable.
After some testing and internal discussions, I would now propose to filter the data which is retrieved from pypi (enabled with a flag).
Rationale:
"Hello 🐓 World".encode("latin1", errors="ignore").decode("utf-8")
latin1
could be a configuration setting.
Would you be willing to consider a PR?
Patch is welcome!! 😄
Ok, now we just need someone who generates the actual PR! ;-)
I'd like to get your input/guidance before starting. What I plan to do:
filter-strings
command line option for turning on filtering (default: off)filter-codec
command line option for selecting a code page from this list (default: latin1)Does that make sense to you? And don't get your hopes up: I cannot commit to if and when this is happening. :-)
@pohlt All right, I'll express my opinion.
- Add boolean
filter-strings
command line option for turning on filtering (default: off)
I think it's fine to adopt it as an opt-out or default behavior.
This is because we already do the equivalent when we load the package license file as a stream. https://github.com/raimon49/pip-licenses/blob/v-2.2.1/piplicenses.py#L148-L149
Note: However, we have adopted errors="backslashreplace"
instead of errors="ignore"
and replaced it with code points.
- Add some code to do the actual filtering (for all strings retrieved from pip)
Yes, I agree.
- Add string
filter-codec
command line option for selecting a code page from this list (default: latin1)- Add docs and tests
I agree with default: latin1
.
Testing and documenting that users can specify the filter-codec
command line option is going to be a challenge. If you can, make it a goal.
This issue has been resolved in pip-licenses 2.3.0 😄
Some packages have unusual characters in their description (e.g. poyo). These characters causes ReST tables to break their column alignment.
Filtering the input strings for regular alpha-numerical characters would fix this, I guess.