best-of-lists / best-of-generator

🏆 Generates a ranked list of awesome libraries and tools.
https://best-of.org
GNU General Public License v3.0

Non-English description will be empty #61

Closed YDX-2147483647 closed 1 year ago

YDX-2147483647 commented 1 year ago

Describe the bug:

If a project's description is not in English, best-of treats the non-ASCII text as special characters and removes all of it, leaving the description empty or nearly empty.

(See remove_special_chars, which is called in process_description.)

https://github.com/best-of-lists/best-of-generator/blob/a0c6d8a37b57e5cfe8a6d684ac312605d94bbedc/src/best_of/utils.py#L32-L56

Expected behaviour:

Steps to reproduce the issue:

Original description: (from BIThesis)

📖 北京理工大学非官方 LaTeX 模板集合,包含本科、研究生毕业设计模板及更多。🎉 (更多文档请访问 wiki 和 release 中的手册) 

Result: (in best-of-bits)

LaTeX wiki release.
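
The effect can be reproduced with a minimal sketch. This is not the code from `utils.py`; `strip_non_ascii` is a hypothetical helper that assumes the cleanup amounts to dropping every non-ASCII character and collapsing the leftover whitespace:

```python
import re


def strip_non_ascii(text: str) -> str:
    """Hypothetical stand-in for the cleanup step: drop all non-ASCII characters."""
    # Characters that cannot be encoded as ASCII are silently discarded.
    ascii_only = text.encode("ascii", errors="ignore").decode("ascii")
    # Collapse the runs of whitespace left behind by the removed characters.
    return re.sub(r"\s+", " ", ascii_only).strip()


# Shortened version of the BIThesis description quoted above.
description = "📖 北京理工大学非官方 LaTeX 模板集合"
print(strip_non_ascii(description))  # only the ASCII word "LaTeX" survives
```

Under this assumption, a description written entirely in Chinese would come out as an empty string, which matches the behaviour in the issue title.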

Technical details:

Possible Fix:

Additional context:

YDX-2147483647 commented 1 year ago

I am working on it.

| Config | Description | Default |
| --- | --- | --- |
| `ascii_description` | If `True`, all non-ASCII characters in the project description are removed. Useful for filtering out distracting emoji, but harmful for non-English descriptions. (Note: GitHub emoji commands, e.g. `:smile:`, are always removed.) | `True` |
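
A rough sketch of how such an option could gate the cleanup is shown below. The function name `process_description_sketch` and the shortcode pattern are assumptions made for illustration, not the generator's actual implementation:

```python
import re

# Assumed pattern for GitHub emoji commands such as :smile: or :+1:.
# It is deliberately simple and may also match text like "10:30:45".
EMOJI_SHORTCODE = re.compile(r":[a-z0-9_+\-]+:")


def process_description_sketch(text: str, ascii_description: bool = True) -> str:
    """Hypothetical cleanup: shortcodes always go, non-ASCII only if requested."""
    # Emoji commands are removed regardless of the option.
    text = EMOJI_SHORTCODE.sub("", text)
    if ascii_description:
        # Drop every character that cannot be encoded as ASCII.
        text = text.encode("ascii", errors="ignore").decode("ascii")
    # Normalize whitespace left behind by the removals.
    return " ".join(text.split())


raw = ":smile: 北京 LaTeX 模板"
print(process_description_sketch(raw))                           # ASCII only
print(process_description_sketch(raw, ascii_description=False))  # keeps the Chinese text
```

With the option turned off, the non-English text is preserved while the emoji command is still stripped, which is the behaviour the table describes.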