Fixes issue #166, reported for Italian TN (text normalization): when a sentence ends with a domain, the sentence-final period was incorrectly normalized as part of the domain. The PR also adds support for social media tags, updates the tests, and fixes test_sparrowhawk_normalization.sh, which was blocking Sparrowhawk testing.
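To illustrate the bug, here is a minimal sketch in plain Python. It uses regular expressions rather than the actual grammars touched by this PR, so every name in it (`NAIVE_DOMAIN`, `DOMAIN`, `verbalize_domain`, `naive_normalize`, `normalize`) is hypothetical, and reading "." aloud as "punto" is an assumption made for the example. The naive pattern swallows the sentence-final period into the domain; the stricter pattern requires the match to end in a top-level-domain label, so the period stays as punctuation.

```python
import re

# Hypothetical naive pattern: a label followed by any number of ".label" parts,
# where a part may be empty, so a trailing sentence period is swallowed too.
NAIVE_DOMAIN = re.compile(r"[a-z0-9-]+(?:\.[a-z0-9-]*)+", re.IGNORECASE)

# Hypothetical stricter pattern: the match must END in a TLD-like label
# (two or more letters), so a trailing "." is left outside the match.
DOMAIN = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def verbalize_domain(domain: str) -> str:
    """Spell out each dot as 'punto' (assumed verbalization for this sketch)."""
    return domain.replace(".", " punto ").strip()


def naive_normalize(sentence: str) -> str:
    """Reproduces the bug: the final period is verbalized as part of the domain."""
    return NAIVE_DOMAIN.sub(lambda m: verbalize_domain(m.group(0)), sentence)


def normalize(sentence: str) -> str:
    """Verbalizes only the domain and keeps the sentence-final period."""
    return DOMAIN.sub(lambda m: verbalize_domain(m.group(0)), sentence)


print(naive_normalize("Visita nvidia.com."))  # Visita nvidia punto com punto
print(normalize("Visita nvidia.com."))        # Visita nvidia punto com.
```

The sketch only shows the intended input/output behavior; it does not cover social media tags or the Sparrowhawk test script.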
Before your PR is "Ready for review"
Pre checks:
[x] Have you signed your commits? Use git commit -s to sign.
[x] Do all unit tests finish successfully before sending the PR?
1) pytest or (if your machine does not have a GPU) pytest --cpu from the root folder, provided you marked your test cases accordingly with @pytest.mark.run_only_on('CPU').
2) Sparrowhawk tests: bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
[x] If you are adding a new feature: Have you added test cases for both pytest and Sparrowhawk here?
[x] Have you added __init__.py for every folder and subfolder, including the data folder which has .TSV files?
[ ] Have you followed the CodeQL results and removed unused variables and imports (the report is at the bottom of the PR in the GitHub review box)?
[x] Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
... Copyright 2015 and onwards Google, Inc. See an example here.
... (try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.