PyThaiNLP / pythainlp

Thai Natural Language Processing in Python.
https://pythainlp.org/
Apache License 2.0
936 stars, 272 forks

Fix empty string ('') added (in some cases) when using word_tokenize with join_broken_num=True #912

S2P2 closed 1 month ago

S2P2 commented 1 month ago

fix empty string bug

What does this change

pythainlp/tokenize/_utils.py: add an `if connected_token:` check so that connected_token is appended to tokens_joined only when it is non-empty

What was wrong

https://github.com/PyThaiNLP/pythainlp/issues/911

How this fixes it

An empty string is no longer appended to tokens_joined.
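To illustrate the guard, here is a minimal, self-contained sketch of how such a check can suppress an empty token. It is not the code in pythainlp/tokenize/_utils.py: the function name `join_broken_num_sketch`, the regular expression, the `guard` switch, and the sample segments are all assumptions made for illustration. Only the names connected_token and tokens_joined come from this pull request.

```python
import re
from typing import List

# Assumed pattern: digit groups joined by ".", "," or ":" (e.g. "1.5", "12:30").
_NUM = re.compile(r"(\d+[\.\,:])+\d+")


def join_broken_num_sketch(segments: List[str], guard: bool = True) -> List[str]:
    """Rejoin tokens that split a formatted number (illustrative only).

    guard=False reproduces the behaviour without the emptiness check.
    """
    text = "".join(segments)
    matches = _NUM.finditer(text)
    tokens_joined: List[str] = []
    pos = 0  # character offset of segments[idx] within text
    idx = 0

    match = next(matches, None)
    while idx < len(segments) and match:
        if pos >= match.start():
            # Concatenate segments until the end of the matched number is covered.
            connected_token = ""
            while pos < match.end() and idx < len(segments):
                connected_token += segments[idx]
                pos += len(segments[idx])
                idx += 1
            # The kind of check this PR describes: skip an empty result.
            if connected_token or not guard:
                tokens_joined.append(connected_token)
            match = next(matches, None)
        else:
            tokens_joined.append(segments[idx])
            pos += len(segments[idx])
            idx += 1
    # Keep whatever remains after the last match.
    tokens_joined += segments[idx:]
    return tokens_joined


# Contrived input: the third segment already covers the second number "2.5".
segments = ["1", ".", "5 2.5", "x"]
print(join_broken_num_sketch(segments, guard=False))  # ['1.5 2.5', '', 'x']
print(join_broken_num_sketch(segments, guard=True))   # ['1.5 2.5', 'x']
```

In this sketch the empty string appears when an earlier join has already consumed the text of a later match, so the inner loop adds nothing to connected_token. Whether that is the exact trigger reported in the linked issue is not confirmed here.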

Your checklist for this pull request

🚨Please review the guidelines for contributing to this repository.

pep8speaks commented 1 month ago

Hello @S2P2! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2024-05-10 10:31:30 UTC

sonarcloud[bot] commented 1 month ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

coveralls commented 1 month ago

Coverage Status

coverage: 79.093% (+0.03%) from 79.063% when pulling dcd2b47018daab3893d05194e8c90cc0d5c9602a on S2P2:fix-join-broken-num into a38fd5e84148402929c861c7e49afd0c5a08abfb on PyThaiNLP:dev.

bact commented 1 month ago

Thank you.