Closed pavaris-pm closed 7 months ago
Hello @pavaris-pm! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:
Hello! Thank you for your pull request. Can you add a filter for words that start with #?
Sure. Do you mean adding a parameter that lets the user control whether the returned corpus includes words that start with #? If it is True, the returned corpus includes words that start with #; otherwise, those words are filtered out (no word in the corpus starts with #).
Yes 👍
I think we can do this in get_corpus(). Maybe add the boolean parameter discard_comments to get_corpus()? The default is probably False.
Or, we can utilize the existing Python standard library shlex for this. shlex will ignore comment lines when it gets its input.
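For reference, a minimal sketch of the two options discussed above. This is not the PyThaiNLP implementation: build_corpus() is a hypothetical stand-in for get_corpus() that takes an iterable of lines instead of a file name, and the sample lines are made up. The shlex part uses only the standard library call shlex.split() with comments=True.

```python
import shlex
from typing import FrozenSet, Iterable


def build_corpus(lines: Iterable[str], discard_comments: bool = False) -> FrozenSet[str]:
    """Hypothetical stand-in for get_corpus(); the real signature may differ.

    Strips whitespace and drops empty lines. When discard_comments is True,
    entries that start with "#" are dropped as well.
    """
    entries = (line.strip() for line in lines)
    if discard_comments:
        return frozenset(e for e in entries if e and not e.startswith("#"))
    return frozenset(e for e in entries if e)


# Made-up sample data: two comment lines, one blank line, two words.
sample = ["# word list header", "กา", "", "ขา", "#note"]

with_comments = build_corpus(sample)
without_comments = build_corpus(sample, discard_comments=True)

# Option 2: let shlex drop the comments. With comments=True, shlex treats
# "#" as a comment character and skips the rest of that line.
shlex_tokens = shlex.split("\n".join(sample), comments=True)
```

One design note on the shlex route: shlex.split() tokenizes on whitespace and also interprets quotes, so an entry that contains a space would be split into several tokens and an unbalanced quote character would raise an error. A plain startswith("#") check avoids both, which may matter for a dictionary file.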
@bact @wannaphong I have added comment filtering through a new parameter named discard_comments, whose default value is False. You can review the code in the latest commit krub.
@bact @wannaphong I ran some experiments to test the discard_comments parameter and fixed some bugs in it. It now works as expected, so feel free to review it krub. It's done 💯
Kudos, SonarCloud Quality Gate passed!
0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells
No Coverage information
0.0% Duplication
Merged, thank you.
What does this change
@wannaphong @bact Following issue #877: since ICU is included in almost all web browsers, I've added the ICU dictionary to PyThaiNLP. The ICU dictionary file is named icubrk_th.txt, and the Python file that loads the corpus is named thai_icu.py krub.
Will resolve #877
Your checklist for this pull request
🚨Please review the guidelines for contributing to this repository.