sids07 closed this issue 3 years ago
Just to confirm, could you try:
- name: rasa_nlu_examples.featurizers.SparseBytePairFeaturizer
  lang: hi
  vs: 1000
The cached use-case is more for folks who want to pre-build docker containers. If you don't pass a folder it should automatically fetch the file if it doesn't exist.
Also! I don't know what dataset you're running this on, but if you have a representative dataset I'd love to hear if these tools increase the performance of your assistant.
@koaning I tried exactly what you are suggesting on my first attempt. It automatically downloaded the Hindi files into the hi directory within cache_dir, but it still asked for the English-language file.
Gotcha. I think I've indeed found the bug here https://github.com/RasaHQ/rasa-nlu-examples/blob/main/rasa_nlu_examples/featurizers/sparse/sparse_bpemb_featurizer.py#L367.
Made a PR here: https://github.com/RasaHQ/rasa-nlu-examples/pull/140.
The PR should contain the fix. If it's still broken, feel free to re-open the issue!
@koaning it is still not working. Even with the changes from PR #140, line 379 still has to be changed.
Current code after the update:
model_fp = (
    Path(cache_dir)
    / self.component_config["lang"]
    / f"en.wiki.bpe.vs{self.component_config['vs']}.model"
)
Change needed to make it work with languages other than English:
model_fp = (
    Path(cache_dir)
    / self.component_config["lang"]
    / f"{self.component_config['lang']}.wiki.bpe.vs{self.component_config['vs']}.model"
)
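The change proposed above amounts to deriving the file name from the configured language instead of hard-coding the `en` prefix. A minimal standalone sketch of that idea follows. The helper name `bpemb_model_path` and its arguments are hypothetical and not part of rasa-nlu-examples; only the file-name pattern is taken from this thread.

```python
from pathlib import Path


def bpemb_model_path(cache_dir, lang, vs):
    """Build the expected BPEmb model path for a language and vocabulary size.

    Hypothetical helper for illustration; not a function of rasa-nlu-examples.
    """
    # Use `lang` for both the subfolder and the file-name prefix,
    # rather than a hard-coded "en" prefix.
    return Path(cache_dir) / lang / f"{lang}.wiki.bpe.vs{vs}.model"


print(bpemb_model_path("cache_dir", "hi", 1000).as_posix())
# prints: cache_dir/hi/hi.wiki.bpe.vs1000.model
```

With `lang="en"` the helper yields the same path as the hard-coded version, so English setups should be unaffected.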
I was trying to apply SparseBytePairFeaturizer to Hindi and passed only cache_dir. The Hindi model is downloaded, but after the download the featurizer looks for the English model in cache_dir. That file is obviously not there, so it throws a file-not-found error.
my config.yml:
My file directory has a cache_dir folder with the following subfolders:
..
...
cache_dir:
-- hi:
---- hi.wiki.bpe.vs1000.model
---- hi.wiki.bpe.vs1000.d25.w2v.bin
...
..
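To double-check a layout like the one above before training, a short script can report which expected files are absent. This is only a sketch: the function name `missing_bpemb_files` is hypothetical, and the two file-name patterns are inferred from the files listed in this issue, not from the library's documentation.

```python
from pathlib import Path


def missing_bpemb_files(cache_dir, lang, vs, dim):
    """Return the expected BPEmb files that are not present under cache_dir/lang.

    Hypothetical helper; file-name patterns are assumed from the listing above.
    """
    lang_dir = Path(cache_dir) / lang
    expected = [
        lang_dir / f"{lang}.wiki.bpe.vs{vs}.model",            # subword model
        lang_dir / f"{lang}.wiki.bpe.vs{vs}.d{dim}.w2v.bin",   # embeddings
    ]
    # Keep only the paths that do not exist as regular files.
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    for path in missing_bpemb_files("cache_dir", "hi", 1000, 25):
        print(f"missing: {path}")
```

An empty result for `hi` together with a file-not-found error mentioning an `en.` file would point at the lookup code, not at the download.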
Error Message: