Closed ken2190 closed 1 year ago
Describe the bug
I'm trying to process text normalization for the HUI German dataset, but I get an error like the one below. Does anyone have an idea how to resolve this issue?
Steps/Code to reproduce bug
(nemo) ubuntu@HP:/mnt/e/tools/nemo$ python /mnt/e/tools/NeMo/get_data.py --data-root /mnt/f/hui_acg --manifests-root /mnt/f/hui_acg/ --normalize-text
[NeMo I 2023-09-27 13:31:45 get_data:105] Skipped downloading data because it exists: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/datasetStatisticClean.zip
[NeMo I 2023-09-27 13:31:45 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/datasetStatisticClean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:32:21 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/datasetStatisticClean.zip.
[NeMo I 2023-09-27 13:32:21 get_data:100] Downloading data: https://opendata.iisys.de/opendata/Datasets/HUI-Audio-Corpus-German/dataset_clean/Friedrich_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Friedrich_Clean.zip
[NeMo I 2023-09-27 13:32:21 get_data:105] Skipped downloading data because it exists: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Bernd_Ungerer_Clean.zip
[NeMo I 2023-09-27 13:32:21 get_data:105] Skipped downloading data because it exists: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Karlsson_Clean.zip
[NeMo I 2023-09-27 13:32:21 get_data:105] Skipped downloading data because it exists: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/others_Clean.zip
[NeMo I 2023-09-27 13:32:21 get_data:105] Skipped downloading data because it exists: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Eva_K_Clean.zip
[NeMo I 2023-09-27 13:32:21 get_data:105] Skipped downloading data because it exists: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Hokuspokus_Clean.zip
[NeMo I 2023-09-27 13:48:43 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Bernd_Ungerer_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:48:43 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Eva_K_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:48:43 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Friedrich_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:48:43 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Hokuspokus_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:48:43 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Karlsson_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:48:43 get_data:109] Unzipping data: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/others_Clean.zip --> /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean
[NeMo I 2023-09-27 13:55:58 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Eva_K_Clean.zip.
[NeMo I 2023-09-27 13:56:16 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Friedrich_Clean.zip.
[NeMo I 2023-09-27 13:57:41 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Hokuspokus_Clean.zip.
[NeMo I 2023-09-27 14:24:01 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Karlsson_Clean.zip.
[NeMo I 2023-09-27 14:44:50 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/others_Clean.zip.
[NeMo I 2023-09-27 14:55:32 get_data:111] Unzipping data is complete: /mnt/f/hui_acg/HUI-Audio-Corpus-German-clean/Bernd_Ungerer_Clean.zip.
[NeMo I 2023-09-27 14:55:32 get_data:263] Processing Speaker: Alexandra_Bogensperger
[NeMo I 2023-09-27 14:55:32 get_data:124] Preparing JSON split for speaker 1.
545it [00:00, 33618.58it/s]
[NeMo I 2023-09-27 14:55:32 get_data:161] Preparing JSON split for speaker 1 is complete.
[NeMo I 2023-09-27 14:55:32 get_data:263] Processing Speaker: Algy_Pug
[NeMo I 2023-09-27 14:55:32 get_data:124] Preparing JSON split for speaker 2.
141it [00:00, 33753.60it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 2 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: AliceDe
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 3.
4it [00:00, 6700.17it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 3 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Anastasiia_Solokha
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 4.
4it [00:00, 9714.66it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 4 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Anka
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 5.
471it [00:00, 30448.79it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 5 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Anna_Samrowski
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 6.
5it [00:00, 8859.96it/s]
[NeMo W 2023-09-27 14:55:33 get_data:158] Skipped speaker 6. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Anna_Simon
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 7.
19it [00:00, 20159.82it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 7 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Anne
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 8.
72it [00:00, 28945.64it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 8 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Antoinette_Huting
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 9.
55it [00:00, 25470.54it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 9 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Anton
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 10.
21it [00:00, 23227.95it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 10 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Apneia
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 11.
10it [00:00, 12136.30it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 11 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Availle
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 12.
532it [00:00, 29288.45it/s]
[NeMo I 2023-09-27 14:55:33 get_data:161] Preparing JSON split for speaker 12 is complete.
[NeMo I 2023-09-27 14:55:33 get_data:263] Processing Speaker: Bernd_Ungerer
[NeMo I 2023-09-27 14:55:33 get_data:124] Preparing JSON split for speaker 13.
32880it [00:01, 28967.69it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 13 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: Boris
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 14.
257it [00:00, 26216.31it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 14 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: Capybara
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 15.
383it [00:00, 35107.60it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 15 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: caromopfen
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 16.
856it [00:00, 31935.28it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 16 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: Cate_Mackenzie
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 17.
12it [00:00, 18497.48it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 17 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: Christian_Al-Kadi
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 18.
408it [00:00, 30715.92it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 18 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: Christina_Lindgruen
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 19.
16it [00:00, 23522.21it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 19 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: ClaudiaSterngucker
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 20.
323it [00:00, 25071.44it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 20 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: ColOhr
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 21.
93it [00:00, 25849.59it/s]
[NeMo I 2023-09-27 14:55:34 get_data:161] Preparing JSON split for speaker 21 is complete.
[NeMo I 2023-09-27 14:55:34 get_data:263] Processing Speaker: Crln_Yldz_Ksr
[NeMo I 2023-09-27 14:55:34 get_data:124] Preparing JSON split for speaker 22.
536it [00:00, 31930.28it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 22 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: DanielGrams
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 23.
29it [00:00, 25425.34it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 23 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: danio
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 24.
3it [00:00, 4822.89it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 24 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Desirée_Löffler
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 25.
18it [00:00, 21832.70it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 25 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Dini_Steyn
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 26.
2it [00:00, 3563.55it/s]
[NeMo W 2023-09-27 14:55:35 get_data:158] Skipped speaker 26. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Dirk_Weber
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 27.
137it [00:00, 26192.89it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 27 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: DomBombadil
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 28.
183it [00:00, 28871.83it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 28 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Eki_Teebi
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 29.
139it [00:00, 28998.17it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 29 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: ekyale
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 30.
224it [00:00, 28817.11it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 30 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Elli
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 31.
286it [00:00, 25192.60it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 31 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Eva_K
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 32.
8539it [00:00, 22756.44it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 32 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Fabian_Grant
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 33.
32it [00:00, 31018.66it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 33 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: fantaeiner
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 34.
200it [00:00, 31004.61it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 34 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Franziska_Paul
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 35.
60it [00:00, 28807.03it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 35 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: fremdschaemen
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 36.
28it [00:00, 30432.89it/s]
[NeMo I 2023-09-27 14:55:35 get_data:161] Preparing JSON split for speaker 36 is complete.
[NeMo I 2023-09-27 14:55:35 get_data:263] Processing Speaker: Friedrich
[NeMo I 2023-09-27 14:55:35 get_data:124] Preparing JSON split for speaker 37.
9590it [00:00, 31844.18it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 37 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Frown
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 38.
209it [00:00, 28455.81it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 38 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Gaby
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 39.
7it [00:00, 12671.61it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 39 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Gesine
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 40.
1it [00:00, 2270.87it/s]
[NeMo W 2023-09-27 14:55:36 get_data:158] Skipped speaker 40. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: heeheekitty
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 41.
81it [00:00, 25623.25it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 41 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Herman_Roskams
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 42.
70it [00:00, 26438.66it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 42 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Herr_Klugbeisser
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 43.
43it [00:00, 18635.57it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 43 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Hokuspokus
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 44.
10584it [00:00, 31652.75it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 44 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Igor_Teaforay
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 45.
215it [00:00, 29591.63it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 45 is complete.
[NeMo I 2023-09-27 14:55:36 get_data:263] Processing Speaker: Imke_Grassl
[NeMo I 2023-09-27 14:55:36 get_data:124] Preparing JSON split for speaker 46.
136it [00:00, 21508.44it/s]
[NeMo I 2023-09-27 14:55:36 get_data:161] Preparing JSON split for speaker 46 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Ingo_Breuer
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 47.
114it [00:00, 23795.69it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 47 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: IvanDean
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 48.
32it [00:00, 26854.29it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 48 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Jessi
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 49.
305it [00:00, 28271.00it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 49 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Joe_Kay
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 50.
35it [00:00, 18842.34it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 50 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: josimosi98
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 51.
6it [00:00, 12912.17it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 51 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Julia_Niedermaier
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 52.
2012it [00:00, 29746.73it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 52 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Kaktus
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 53.
4it [00:00, 5797.24it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 53 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Kalynda
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 54.
394it [00:00, 22042.89it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 54 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Kanta
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 55.
3it [00:00, 5003.15it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 55 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Kara_Shallenberg
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 56.
52it [00:00, 26223.86it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 56 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: KarinM
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 57.
307it [00:00, 20873.95it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 57 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Karlsson
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 58.
10736it [00:00, 30896.67it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 58 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: keltoi
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 59.
860it [00:00, 26357.30it/s]
[NeMo I 2023-09-27 14:55:37 get_data:161] Preparing JSON split for speaker 59 is complete.
[NeMo I 2023-09-27 14:55:37 get_data:263] Processing Speaker: Klaus_Beutelspacher
[NeMo I 2023-09-27 14:55:37 get_data:124] Preparing JSON split for speaker 60.
12it [00:00, 16496.77it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 60 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Klaus_Neubauer
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 61.
758it [00:00, 28217.40it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 61 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Knubbel
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 62.
3it [00:00, 5182.42it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 62 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Laila_Katinka
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 63.
1it [00:00, 1679.06it/s]
[NeMo W 2023-09-27 14:55:38 get_data:158] Skipped speaker 63. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Larry_Greene
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 64.
16it [00:00, 15705.33it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 64 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Lars_Rolander_(1942-2016)
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 65.
745it [00:00, 28759.04it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 65 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Lektor
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 66.
3it [00:00, 6801.57it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 66 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: leserchen
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 67.
65it [00:00, 27876.25it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 67 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: LillY
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 68.
7it [00:00, 11829.22it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 68 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: lorda
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 69.
195it [00:00, 28319.29it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 69 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: LordOider
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 70.
21it [00:00, 9642.08it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 70 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: LyricalWB
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 71.
479it [00:00, 25347.54it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 71 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: manuwolf
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 72.
9it [00:00, 14485.32it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 72 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: marham63
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 73.
1877it [00:00, 32591.92it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 73 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Markus_Wachenheim
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 74.
91it [00:00, 23905.90it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 74 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Martin_Harbecke
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 75.
102it [00:00, 22139.26it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 75 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Mat
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 76.
1it [00:00, 2383.13it/s]
[NeMo W 2023-09-27 14:55:38 get_data:158] Skipped speaker 76. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Matze
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 77.
65it [00:00, 17870.33it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 77 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: melaniesandra
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 78.
24it [00:00, 23481.06it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 78 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: merendo07
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 79.
10it [00:00, 15851.49it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 79 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: mindfulheart
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 80.
3it [00:00, 6253.93it/s]
[NeMo W 2023-09-27 14:55:38 get_data:158] Skipped speaker 80. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: Monika_M._C
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 81.
326it [00:00, 31313.68it/s]
[NeMo I 2023-09-27 14:55:38 get_data:161] Preparing JSON split for speaker 81 is complete.
[NeMo I 2023-09-27 14:55:38 get_data:263] Processing Speaker: njall
[NeMo I 2023-09-27 14:55:38 get_data:124] Preparing JSON split for speaker 82.
5it [00:00, 7492.50it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 82 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: noonday
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 83.
1it [00:00, 1560.96it/s]
[NeMo W 2023-09-27 14:55:39 get_data:158] Skipped speaker 83. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Ohrbuch
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 84.
663it [00:00, 27358.73it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 84 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: OldZach
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 85.
125it [00:00, 19049.09it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 85 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Orsina
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 86.
6it [00:00, 8425.12it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 86 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: PeWaOt
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 87.
231it [00:00, 19847.27it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 87 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Ragnar
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 88.
330it [00:00, 16557.06it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 88 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Rainer
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 89.
94it [00:00, 23163.42it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 89 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Ralf
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 90.
1it [00:00, 2559.06it/s]
[NeMo W 2023-09-27 14:55:39 get_data:158] Skipped speaker 90. Not enough data for train, val and test.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Ramona_Deininger-Schnabel
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 91.
1482it [00:00, 28377.02it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 91 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Rebecca_Braunert-Plunkett
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 92.
667it [00:00, 27243.96it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 92 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: RenateIngrid
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 93.
1274it [00:00, 27668.83it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 93 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Rhigma
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 94.
108it [00:00, 21351.10it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 94 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Robert_Steiner
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 95.
212it [00:00, 24120.23it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 95 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Rogthey
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 96.
960it [00:00, 25748.05it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 96 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Sandra_Schmit
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 97.
95it [00:00, 26779.95it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 97 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Sascha
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 98.
5it [00:00, 9035.55it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 98 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: schrm
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 99.
385it [00:00, 28601.41it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 99 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Sebastian
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 100.
125it [00:00, 29896.11it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 100 is complete.
[NeMo I 2023-09-27 14:55:39 get_data:263] Processing Speaker: Sellafield
[NeMo I 2023-09-27 14:55:39 get_data:124] Preparing JSON split for speaker 101.
10it [00:00, 15905.59it/s]
[NeMo I 2023-09-27 14:55:39 get_data:161] Preparing JSON split for speaker 101 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Shanty
[NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 102.
28it [00:00, 27242.06it/s]
[NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 102 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Silke_Britz
[NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 103.
45it [00:00, 21924.00it/s]
[NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 103 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Silmaryllis
[NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 104.
336it [00:00, 14878.76it/s]
[NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 104 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Sonia
[NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 105.
473it [00:00, 30093.38it/s]
[NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 105 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Sonja
[NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 106.
130it [00:00, 23313.64it/s]
[NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 106 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: storylines
[NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 107.
21it [00:00, 26804.74it/s]
[NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 107 is complete.
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Tabea [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 108. 13it [00:00, 18958.95it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 108 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Tanja_Ben_Jeroud [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 109. 105it [00:00, 24909.61it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 109 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: thinkofelephants [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 110. 6it [00:00, 9935.19it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 110 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Traxxo [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 111. 96it [00:00, 26051.58it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 111 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Ute2013 [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 112. 64it [00:00, 21387.58it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 112 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Verena [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 113. 58it [00:00, 21884.64it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 113 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Victoria_Asztaller [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 114. 165it [00:00, 29841.76it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 114 is complete. 
[NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Wolfgang [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 115. 60it [00:00, 26302.07it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 115 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Zach_K [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 116. 18it [00:00, 24020.83it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 116 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Zieraffe [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 117. 33it [00:00, 18107.28it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 117 is complete. [NeMo I 2023-09-27 14:55:40 get_data:263] Processing Speaker: Zue_Von_Zob [NeMo I 2023-09-27 14:55:40 get_data:124] Preparing JSON split for speaker 118. 11it [00:00, 13261.67it/s] [NeMo I 2023-09-27 14:55:40 get_data:161] Preparing JSON split for speaker 118 is complete. [NeMo I 2023-09-27 14:55:40 get_data:298] Saving Speaker to ID mapping to /mnt/f/hui_acg/spk2id.csv. [NeMo I 2023-09-27 14:55:40 get_data:115] Saving JSON split to /mnt/f/hui_acg/train_manifest.json. [NeMo I 2023-09-27 14:55:42 get_data:115] Saving JSON split to /mnt/f/hui_acg/val_manifest.json. [NeMo I 2023-09-27 14:55:42 get_data:115] Saving JSON split to /mnt/f/hui_acg/test_manifest.json. [NeMo I 2023-09-27 14:56:17 get_data:196] Normalizing text for /mnt/f/hui_acg/train_manifest.json. 
10%|███████▌ | 8692/86811 [28:47<7:53:04, 2.75it/s]
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 273, in _wrap_func_call
    return func()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 588, in __call__
    return [func(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 588, in <listcomp>
    return [func(*args, **kwargs)
  File "/mnt/e/tools/NeMo/get_data.py", line 192, in add_normalized_text
    normalized_text = normalizer_call(line_dict["text"])
  File "/mnt/e/tools/NeMo/get_data.py", line 189, in normalizer_call
    return text_normalizer.normalize(x, **text_normalizer_call_kwargs)
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/normalize.py", line 328, in normalize
    tokens = self.parser.parse()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/token_parser.py", line 53, in parse
    token = self.parse_token()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/token_parser.py", line 76, in parse_token
    value = self.parse_token_value()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/token_parser.py", line 98, in parse_token_value
    list_token_dicts = self.parse()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/token_parser.py", line 53, in parse
    token = self.parse_token()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/token_parser.py", line 67, in parse_token
    key = self.parse_string_key()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/nemo_text_processing/text_normalization/token_parser.py", line 141, in parse_string_key
    assert self.char not in string.whitespace and self.char != EOS
AssertionError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/e/tools/NeMo/get_data.py", line 316, in <module>
    main()
  File "/mnt/e/tools/NeMo/get_data.py", line 310, in main
    __text_normalization(train_json, args.num_workers)
  File "/mnt/e/tools/NeMo/get_data.py", line 201, in __text_normalization
    dict_list = Parallel(n_jobs=num_workers, backend="threading")(
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 1944, in __call__
    return output if self.return_generator else list(output)
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 1587, in _get_outputs
    yield from self._retrieve()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 1691, in _retrieve
    self._raise_error_fast()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 1726, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 735, in get_result
    return self._return_or_raise()
  File "/home/ubuntu/miniconda3/envs/nemo/lib/python3.10/site-packages/joblib/parallel.py", line 753, in _return_or_raise
    raise self._result
AssertionError
10%|███████▋ | 8735/86811 [28:50<4:17:48, 5.05it/s]
Environment overview
Environment details
Additional context
GPU model: RTX3090
I tried adding the argument --num-workers 1, and with that the script ran without the error.
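The traceback shows the job running under joblib's "threading" backend and failing inside self.parser.parse(), and a single worker avoids the error. One possible explanation, which I have not confirmed, is that several threads are mutating parser state on one shared normalizer object. If that is the cause, giving each worker thread its own normalizer might let you keep more than one worker. Below is a minimal sketch of that idea; normalize_all and make_normalizer are hypothetical names for illustration and are not part of get_data.py or NeMo.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def normalize_all(texts, make_normalizer, num_workers=4):
    """Normalize texts in parallel with one normalizer per worker thread.

    make_normalizer is a hypothetical zero-argument factory that returns a
    callable mapping str -> str. Returns a list of (text, error) pairs in
    input order; error is None when normalization succeeded.
    """
    local = threading.local()  # holds a separate normalizer for each thread

    def work(text):
        # Build the normalizer lazily, once per thread, so that no
        # parser state is shared between threads.
        if not hasattr(local, "normalizer"):
            local.normalizer = make_normalizer()
        try:
            return local.normalizer(text), None
        except AssertionError as err:
            # Keep the raw text and record the failure instead of
            # aborting the whole run on one bad entry.
            return text, err

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(work, texts))
```

With NeMo, the factory would presumably wrap whatever constructs text_normalizer in get_data.py and return its normalize method. Building a normalizer per thread may be slow and memory-hungry, so this is a trade-off to measure rather than a guaranteed fix. The returned error entries also make it easy to list which manifest lines fail, if the problem turns out to be in the input text rather than in threading.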