Closed albertocarpentieri closed 2 years ago
Hi, I do not understand the second comment, can you please explain it to me again? Furthermore, I received the email about the tests and I do not understand why it did not pass some of the tests.
Let me know,
Alberto Il venerdì 24 settembre 2021, 21:30:44 CEST, Joan Fontanals @.***> ha scritto:
@JoanFM commented on this pull request.
Looks like a really good PR, I have some small comments
In jinahub/segmenters/Sentencizer/sentencizer.py:
@@ -82,6 +85,32 @@ def init( r'\s([^{0}]+)(?<!\s)[{0}]'.format(''.join(set(self.punct_chars))) )
I think is better to make them private
In jinahub/segmenters/Sentencizer/sentencizer.py:
@@ -35,6 +36,8 @@ def init( :param punct_chars: the punctuation characters to split on, whatever is in the list will be used, for example ['!', '.', '?'] will use '!', '.' and '?'
- If smart_tokenizer=True is passed to segment, punct_chars is
I think it would be good to have an init parameter that indeed can be overriden by parameters
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.
The idea is that u are reading the parameters from the parameters in the call. but normally there is an object parameter determining the default behavior
Ok thanks, I will add to my local branch this modification. I will send another pull request asap!
Il lunedì 27 settembre 2021, 13:26:13 CEST, Joan Fontanals ***@***.***> ha scritto:
The idea is that u are reading the parameters from the parameters in the call. but normally there is an object parameter determining the default behavior
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.
Hey @albertocarpentieri
Any update on this?
Hi Cristian, it was a PR for the recruitment process. But when I'm free from work I'll continue on that.
Il venerdì 8 ottobre 2021, 14:11:39 CEST, cristian ***@***.***> ha scritto:
Hey @albertocarpentieri
Any update on this?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.
Hey @albertocarpentieri
Thanks for the initiative.
However, we will close this PR for now, as it's been left unattended for some time. If you want to continue the work, let us know and we can re-open it.
Add of smart_tokenizer parameter to segment function of Sentencizer executor in order to not consider dots in abbreviations. The feature uses nltk.tokenize.sent_tokenize, for which a new requirement is inserted.