-
I set up MeCab in the location mentioned in the docs, but I am not able to get Japanese tokenization working. Has anyone seen this before?
```
!echo "雪の風景" | python3 ./LASER/sourc…
-
Hi,
I have a question about the `learnbpe` operation. The example in the `README.md` learns the BPE codes jointly for `en` and `de`, and then applies the codes to `en` and `de` separately.
```
./fast learn…
-
Great work!
After I followed the instructions and finished all the environment configuration, I ran the usage example:
2023-11-29 12:14:00 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX…
-
I am running CodeGen on the test repository (https://github.com/facebookresearch/CodeGen/tree/main/data/test_dataset) in obfuscation mode:
`run codegen_sources/preprocessing/preprocess.py dat…
-
Hi,
Could you please release the preprocessing code for generating the structural sequence and the commands for applying BPE? That is, how can I get the files in [corpus_sample/all_path_corpus](https://…
-
Thank you for sharing your code.
I have a question about how to preprocess the data. For example, for the IWSLT en-de dataset, you use a file named train.tags.en-de.bpe.dev.en2 in the script run_al…
-
Hopefully this is a simple question, but I'm struggling and could use help.
I was able to run the first example, but I am stuck here:
```
import torch
from src.transformer_lm_prompt import Transform…
-
Hello,
Could anybody please guide me on how to run the standard BioGPT model using the code below?
`import torch
from fairseq.models.transformer_lm import TransformerLanguageMod…
-
I am trying to run the `preprocess.py` file and am getting this unknown error. Can you tell me how to resolve it?
````
run codegen_sources/preprocessing/preprocess.py data/test_dataset --mode obfu…
-
Hi,
Thanks for this project, it looks like it could be really helpful. Sorry if this is a stupid question but I was wondering, once I've tokenized a set of SMILES using the pre-trained SMILES model…