EveryVoiceTTS / EveryVoice

The EveryVoice TTS Toolkit - Text To Speech for your language
https://docs.everyvoice.ca
Other
23 stars 2 forks source link

Dev.ej/460 #578

Closed joanise closed 2 weeks ago

joanise commented 3 weeks ago

PR Goal?

Fix the bug where a literal / in text would cause a crash, by replacing it with _SLASH_ when the characters are joined with /.

Fixes?

Fixes #460 Fixes #540

Feedback sought?

This _SLASH_ is not just in internal state, it's also saved in preprocessed/filelist.psv, which means it's ultimately user visible. Because of that, I would like a strong agreement on what string to use here. Is _SLASH_ OK or should I use <SLASH> or something else? I'm not worried about collisions, but whatever choice we make now will be permanent because changing it in the future will be a breaking change.

Beyond that, regular sanity checking.

Priority?

beta

Tests added?

yes

How to test?

Follow https://github.com/EveryVoiceTTS/EveryVoice/issues/460#issuecomment-2455849389 and see that everyvoice preprocess doesn't crash.

Or use everyvoice/tests/data/metadata_slash_pipe.psv in the wizard and see that everyvoice preprocess doesn't crash.

Confidence?

High for the coding, low for the choice of substitution string.

Version change?

no

Related PRs?

none

semanticdiff-com[bot] commented 3 weeks ago

Review changes with  SemanticDiff

Changed Files
| File | Status | | :--- | :--- | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/preprocessor/preprocessor.py)  [everyvoice/preprocessor/preprocessor\.py](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/preprocessor/preprocessor.py) | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/preprocessor/preprocessor.py)  55% smaller | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/text/text_processor.py)  [everyvoice/text/text\_processor\.py](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/text/text_processor.py) | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/text/text_processor.py)  45% smaller | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/test_text.py)  [everyvoice/tests/test\_text\.py](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/test_text.py) | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/test_text.py)  15% smaller | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#.github/workflows/docs.yml)  [\.github/workflows/docs\.yml](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#.github/workflows/docs.yml) | Unsupported file format | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#.github/workflows/publish.yaml)  [\.github/workflows/publish\.yaml](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#.github/workflows/publish.yaml) | Unsupported file format | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#.github/workflows/test.yml)  [\.github/workflows/test\.yml](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#.github/workflows/test.yml) | Unsupported file format | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/cli.py)  [everyvoice/cli\.py](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/cli.py) | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/cli.py)  0% smaller | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/data/metadata_slash_pipe.psv)  [everyvoice/tests/data/metadata\_slash\_pipe\.psv](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/data/metadata_slash_pipe.psv) | Unsupported file format | | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/test_preprocessing.py)  [everyvoice/tests/test\_preprocessing\.py](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/test_preprocessing.py) | [](https://app.semanticdiff.com/gh/EveryVoiceTTS/EveryVoice/pull/578/changes#everyvoice/tests/test_preprocessing.py)  0% smaller |
github-actions[bot] commented 3 weeks ago
CLI load time: 0:00.30
Pull Request HEAD: 2e4cc9b3f77367fd032eeeca1e2360e2149b854e
Imports that take more than 0.1 s:
import time: self [us] | cumulative | imported package
codecov[bot] commented 3 weeks ago

Codecov Report

Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 76.58%. Comparing base (49025c0) to head (2e4cc9b). Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
everyvoice/preprocessor/preprocessor.py 66.66% 0 Missing and 1 partial :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #578 +/- ## ========================================== + Coverage 76.40% 76.58% +0.18% ========================================== Files 46 46 Lines 3450 3451 +1 Branches 472 470 -2 ========================================== + Hits 2636 2643 +7 + Misses 710 706 -4 + Partials 104 102 -2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

marctessier commented 3 weeks ago

I ran some tests with some custom data , this does fix the bug while preprocessing. I also did a quick training run with success . No issues detected.

joanise commented 3 weeks ago

Decision, in consultation with @SamuelLarkin : since we already use <SIL> in similar contexts, let's use <SLASH> to maintain better consistency.