ftyers / commonvoice-utils

Linguistic processing for Common Voice
GNU Affero General Public License v3.0
51 stars 14 forks

[FR] Make "Function not implemented" errors only valid for "covo" #51

Open HarikalarKutusu opened 9 months ago

HarikalarKutusu commented 9 months ago

If you use the code directly from Python rather than through "covo", you still get "Function not implemented" messages. I normally check whether a function exists before calling it, but that is not possible here: I am using the phonemiser class, which in turn calls the validator internally.

Here is what I get when analyzing 144 corpora in parallel with a progress bar:

=== Text-Corpora Compilation Process for cv-tbox-dataset-compiler ===
Processing text-corpora for 144 locales in 12 processes with chunk_size 10...

  0%|                                                                                                                                                                                       | 0/144 [00:00<?, ?it/s][Validator] Function not implemented
[Validator] Function not implemented
[Validator] Function not implemented
  1%|█▏                                                                                                                                                                             | 1/144 [00:10<25:29, 10.69s/it][Validator] Function not implemented
  8%|█████████████▎                                                                                                                                                                | 11/144 [00:15<02:30,  1.13s/it][Validator] Function not implemented
[Validator] Function not implemented
 15%|█████████████████████████▍                                                                                                                                                    | 21/144 [00:28<02:34,  1.25s/it][Validator] Function not implemented
 22%|█████████████████████████████████████▍                                                                                                                                        | 31/144 [00:30<01:25,  1.33it/s][Validator] Function not implemented
 31%|██████████████████████████████████████████████████████▍                                                                                                                       | 45/144 [03:18<08:47,  5.33s/it][Validator] Function not implemented
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 144/144 [12:34<00:00,  5.24s/it]
Finished compiling text-corpus for 144 locales in 754.62 avg=5.24 sec/locale
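
Until the library handles this, a caller-side workaround could look like the sketch below. It assumes the message is written to stdout or stderr from Python code, which has not been checked against the library source. `noisy_validate` is a hypothetical stand-in for the real phonemiser/validator call, and `call_quietly` is an illustrative helper name, not part of commonvoice-utils.

```python
import contextlib
import io
import sys


def noisy_validate(text):
    """Hypothetical stand-in for a library call that prints a notice."""
    print("[Validator] Function not implemented", file=sys.stderr)
    return None


def call_quietly(func, *args, **kwargs):
    """Call func while discarding anything it writes to stdout or stderr."""
    sink = io.StringIO()
    # Both streams are swapped only for the duration of the call.
    with contextlib.redirect_stdout(sink), contextlib.redirect_stderr(sink):
        return func(*args, **kwargs)


# The notice is swallowed; the return value is passed through unchanged.
result = call_quietly(noisy_validate, "merhaba")
```

With a process pool, the wrapper would need to be applied inside each worker function, since the streams are redirected per process. It also hides any other output from the wrapped call, so it is a stopgap rather than a replacement for the requested change.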