bigscience-workshop / t-zero

Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
Apache License 2.0
456 stars, 53 forks

How to reproduce the results in an offline environment? I downloaded the sentencepiece.model and the checkpoint, but the sentencepiece.model is not recognised by t5 #21

Closed flyingwaters closed 2 years ago

flyingwaters commented 2 years ago

Hi @VictorSanh, I ran into some problems while reproducing your results. With t5==0.9.3, I am training the model on GPUs in an offline environment, so I downloaded sentencepiece.model manually and passed these flags:

--gin_param="tsv_dataset_fn.vocabulary = SentencePieceVocabulary()" --gin_param="get_sentencepiece_model_path = '/raid/yiptmp/huggingface-models/t5.1.1.lm100k.xxl'"

This fails with the following error:

SyntaxError: malformed node or string: <_ast.Name object at 0x7f42f90404d0>
Failed to parse token 'SentencePieceVocabulary'

I think T0 is great. Can you fix this bug? I also suspect that the t5 version you used has since been updated. Could you provide the requirements with pinned version numbers? I think that would help many researchers reproduce the work and develop this technique further. Thanks for your work!

Also, should the downloaded story_cloze data be saved under ~/.cache? I don't see a data path option in the command for adding a task.
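The "malformed node or string" wording in the error above is the message Python's `ast.literal_eval` produces when it is given something that is not a literal. The following is a minimal sketch of that behaviour using only the standard library. It assumes, without having checked Gin's source, that Gin falls back to literal parsing for binding values. The helper name `try_literal` is hypothetical and is not part of Gin or t5.

```python
import ast


def try_literal(value: str):
    """Return (True, parsed) if value is a Python literal, else (False, message)."""
    try:
        return True, ast.literal_eval(value)
    except (ValueError, SyntaxError) as err:
        return False, str(err)


# A quoted path is an ordinary string literal, so it parses.
ok_path, parsed_path = try_literal(
    "'/raid/yiptmp/huggingface-models/t5.1.1.lm100k.xxl'"
)

# A bare identifier is a Name node, not a literal, so parsing is rejected.
ok_name, name_error = try_literal("SentencePieceVocabulary")

print(ok_path, parsed_path)
print(ok_name, name_error)
```

Gin's documented syntax for referring to a configurable inside a binding uses an `@` prefix (`@name` for a reference, `@name()` for an evaluated call), so a binding value written as `@SentencePieceVocabulary()` may be worth trying. Whether that class is registered as a configurable in t5==0.9.3, and whether this resolves the error here, is an assumption that has not been tested.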

VictorSanh commented 2 years ago

duplicate of #20. closing this.