bene-ges / nemo_compatible

useful things that work with NVIDIA NeMo library
Apache License 2.0

Potential Typo #4

Closed: thomaschhh closed this issue 10 months ago

thomaschhh commented 1 year ago

I am trying to walk through the steps mentioned at https://github.com/bene-ges/nemo_compatible/blob/main/scripts/nlp/en_spellmapper/README.md and am running into:

"FileNotFoundError: [Errno 2] No such file or directory: 'yago.uniq3'"

Might it be the case that it's supposed to be "yago.uniq2" instead?

https://github.com/bene-ges/nemo_compatible/blob/194af660d9b6d3d578884048d40b524775fd10e8/scripts/nlp/en_spellmapper/dataset_preparation/run_g2p.sh#L21

bene-ges commented 1 year ago

Yes, you are right, thanks!

Btw, check the newer version of the dataset on my Hugging Face page; it contains some intermediate files too.

Feel free to ask questions

thomaschhh commented 11 months ago

I think I found another typo: https://github.com/bene-ges/nemo_compatible/blob/194af660d9b6d3d578884048d40b524775fd10e8/scripts/nlp/en_spellmapper/dataset_preparation/prepare_corpora_after_alignment.py#L246

Based on line 23, it should be extract_giza_alignments, not extract_alignments.

bene-ges commented 11 months ago

Thanks, fixed in .sh file

thomaschhh commented 11 months ago

https://github.com/bene-ges/nemo_compatible/blob/6c120745e8d42d406d4c19b14baecddf97500b92/scripts/nlp/en_spellmapper/dataset_preparation/preprocess_yago.sh#L1

I think NEMO_PATH should be NEMO_COMPATIBLE_PATH

bene-ges commented 11 months ago

Sure, fixed