PhilipMay opened 1 year ago
See Philip's code examples in GerAlpacaDataCleaned:
```python
import torch
import more_itertools
from tqdm import tqdm

# Load the WMT19 EN→DE ensemble (4 checkpoints) from the fairseq torch.hub.
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de',
                       checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
                       tokenizer='moses', bpe='fastbpe')
_ = en2de.cuda()

# Translate in batches of 10 sentences; df is a DataFrame with an "en" column.
en_de_texts = []
chunks = list(more_itertools.chunked(df["en"].tolist(), 10))
for chunk in tqdm(chunks):
    en_de_texts.extend(en2de.translate(chunk))
```
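The chunked-translation loop can be exercised without a GPU or the model by swapping in a stub translator (`fake_translate` is a hypothetical stand-in for `en2de.translate`, added here only to show the batching pattern):

```python
import more_itertools
from tqdm import tqdm

def fake_translate(batch):
    # Hypothetical stand-in for en2de.translate: tags each sentence
    # instead of actually translating it.
    return [f"DE:{s}" for s in batch]

texts = [f"sentence {i}" for i in range(25)]

# Same pattern as above: batches of 10, results flattened in order.
en_de_texts = []
chunks = list(more_itertools.chunked(texts, 10))
for chunk in tqdm(chunks):
    en_de_texts.extend(fake_translate(chunk))

assert len(en_de_texts) == len(texts)
```

Because `chunked` preserves order and `extend` flattens the batches, the output list lines up one-to-one with the input sentences.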
We could also add facebook/nllb-200-distilled-600M.
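A hedged sketch of how that model could be wired up, assuming the Hugging Face transformers `pipeline` API and FLORES-200 language codes (neither is code from this repo):

```python
MODEL_NAME = "facebook/nllb-200-distilled-600M"
SRC, TGT = "eng_Latn", "deu_Latn"  # NLLB uses FLORES-200 language codes

def build_translator():
    # Assumed transformers API; imported lazily so the module
    # loads even without transformers installed.
    from transformers import pipeline
    return pipeline("translation", model=MODEL_NAME, src_lang=SRC, tgt_lang=TGT)

# Usage (downloads the model on first call):
# translator = build_translator()
# translator(["Hello world"])
```

Unlike the fairseq ensemble above, this is a single distilled checkpoint, so it should be lighter to run.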
The fairseq model dependencies are:

- hydra-core
- omegaconf
- bitarray
- sacrebleu
- sacremoses
- Cython
- fastBPE
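Assuming all of these come from PyPI, the set above can be installed in one go (unpinned; fastBPE builds a C++ extension and may fail, see the issue linked below):

```shell
pip install hydra-core omegaconf bitarray sacrebleu sacremoses Cython fastBPE
```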
Known problem with fastBPE (a workaround is described in this comment): https://github.com/glample/fastBPE/issues/27#issuecomment-531544543