Closed: joanise closed 2 weeks ago
CLI load time: 0:00.30
Pull Request HEAD: 2e4cc9b3f77367fd032eeeca1e2360e2149b854e
Imports that take more than 0.1 s:
import time: self [us] | cumulative | imported package
Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.
Project coverage is 76.58%. Comparing base (49025c0) to head (2e4cc9b). Report is 2 commits behind head on main.
Files with missing lines | Patch % | Lines |
---|---|---|
everyvoice/preprocessor/preprocessor.py | 66.66% | 0 Missing and 1 partial :warning: |
I ran some tests with custom data, and this does fix the bug during preprocessing. I also did a quick training run, which succeeded. No issues detected.
Decision, in consultation with @SamuelLarkin: since we already use `<SIL>` in similar contexts, let's use `<SLASH>` to maintain better consistency.
PR Goal?
Fix the bug where a literal `/` in text would cause a crash, by replacing it with `_SLASH_` when the characters are joined with `/`.
Fixes?
Fixes #460 Fixes #540
Feedback sought?
This `_SLASH_` is not just in internal state, it's also saved in `preprocessed/filelist.psv`, which means it's ultimately user visible. Because of that, I would like strong agreement on what string to use here. Is `_SLASH_` OK, or should I use `<SLASH>` or something else? I'm not worried about collisions, but whatever choice we make now will be permanent, because changing it in the future would be a breaking change.
Beyond that, regular sanity checking.
Priority?
beta
Tests added?
yes
How to test?
Follow https://github.com/EveryVoiceTTS/EveryVoice/issues/460#issuecomment-2455849389 and see that `everyvoice preprocess` doesn't crash.
Or use `everyvoice/tests/data/metadata_slash_pipe.psv` in the wizard and see that `everyvoice preprocess` doesn't crash.
Confidence?
High for the coding, low for the choice of substitution string.
Version change?
no
Related PRs?
none