Closed gayu-thri closed 11 months ago
Following up on this: the suggested changes were made a few weeks ago, but the PR has not been merged yet.
If any further changes are needed before merging, please let me know.
After reviewing this PR, we have decided not to merge it for the following reasons:
Thank you for your effort — we look forward to future contributions.
Thanks. Sure.
- The grammar provided offers functionality that can already be obtained through the whitelist class by adding (keyword, transformation) pairs to the whitelist data file.
I'd like to clarify this. Isn't profanity filtering a different kind of transformation, one that does not apply to all whitelisted words?
Of course, we could add a predefined list of pairs, each with a spoken form and a written (filtered) form, to the whitelist.
But if it has to be handled at the grammar level, wouldn't maintaining a separate classifier be better?
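To make the whitelist option concrete, here is a minimal sketch in plain Python of what a predefined list of (spoken, written) pairs would do. It is not the grammar code from this PR or the existing whitelist class. The sample words, the masked forms, and the names `load_pairs` and `redact` are all assumptions made up for illustration.

```python
import re

# Hypothetical (spoken, written) pairs, one tab-separated pair per row,
# mirroring how a whitelist data file could store them. The words and the
# masked forms are placeholders, not entries from the PR.
WHITELIST_TSV = "darn\td***\nheck\th***\n"


def load_pairs(tsv_text):
    """Parse tab-separated rows into a {spoken: written} dict."""
    pairs = {}
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank rows
        spoken, written = line.split("\t")
        pairs[spoken.lower()] = written
    return pairs


def redact(text, pairs):
    """Replace each whole-word, case-insensitive match with its written form."""
    if not pairs:
        return text
    # Longest keys first so longer entries win over shorter ones they contain.
    keys = sorted(pairs, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: pairs[m.group(0).lower()], text)


pairs = load_pairs(WHITELIST_TSV)
redacted = redact("well darn it", pairs)
```

With this approach every profane word needs its own row in the data file, which is the maintenance cost the question above is weighing against a dedicated classifier.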
What does this PR do?
This PR adds a new feature to ITN (EN) for filtering profane words. With this, profane words in the input text are redacted with the * symbol.
Before your PR is "Ready for review"
Pre checks:
- git commit -s to sign.
- 1) pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly with @pytest.mark.run_only_on('CPU')). 2) Sparrowhawk tests: bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
- pytest and Sparrowhawk here.
- __init__.py for every folder and subfolder, including the data folder, which has .TSV files?
- Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
- Copyright 2015 and onwards Google, Inc. See an example here.
- (try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.