LydiaXiaohongLi / Albert_Finetune_with_Pretrain_on_Custom_Corpus

1. Pretrain Albert on custom corpus 2. Finetune the pretrained Albert model on downstream task
33 stars 8 forks source link

error in running ALBERT create_pretraining_data.py #4

Closed YuBeomGon closed 4 years ago

YuBeomGon commented 4 years ago

Hi Hi Hi

Hi, thanks in advance for your share I try ot run create_pretraining_data.py (vocab.txt, I make it by referring your medium.)

python3 create_pretraining_data.py --input_file ../bert-vocab-builder/IMDB_review.txt --output_file output --vocab_file ../bert-vocab-builder/vocab.txt max_seq_length=64

def main(_): tf.logging.set_verbosity(tf.logging.INFO)

tokenizer = tokenization.FullTokenizer( vocab_file=FLAGS.vocab_file, do_lower_case=FLAGS.do_lower_case, spm_model_file=FLAGS.spm_model_file)

input_files = [] for input_pattern in FLAGS.input_file.split(","): input_files.extend(tf.gfile.Glob(input_pattern)) print(len(input_files)) # -> size is 1 print(type(input_files))

it didnt create the create_training_instances because I think input_files size is 1. what should I check?

my corpus is like below. IMDB_review.txt file is like below.head ../bert-vocab-builder/IMDB_review.txt One of the other reviewers has mentioned that after watching just Oz episode you ll be hooked They are right as this is exactly what happened with me The first thing that struck me about Oz was its brutality and unflinching scenes of violence which set in right from the word GO Trust me this is not show for the faint hearted or timid This show pulls no punches with regards to drugs sex or violence Its is hardcore in the classic use of the word It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary It focuses mainly on Emerald City an experimental section of the prison where all the cells have glass fronts and face inwards so privacy is not high on the agenda Em City is home to many Aryans Muslims gangstas Latinos Christians Italians Irish and more so scuffles death stares dodgy dealings and shady agreements are never far away would say the main appeal of the show is due to the fact that it goes where other shows wouldn dare Forget pretty pictures painted for mainstream audiences forget charm forget romance OZ doesn mess around The first episode ever saw struck me as so nasty it was surreal couldn say was ready for it but as watched more developed taste for Oz and got accustomed to the high levels of graphic violence Not just violence but injustice crooked guards who ll be sold out for nickel inmates who ll kill on order and get away with it well mannered middle class inmates being turned into prison bitches due to their lack of street skills or prison experience Watching Oz you may become comfortable with what is uncomfortable viewing thats if you can get in touch with your darker side . A wonderful little production The filming technique is very unassuming very old time BBC fashion and gives comforting and sometimes discomforting sense of realism to the entire piece The actors are extremely well chosen Michael Sheen not only has got all the polari but he has all the voices down pat too You can truly see the seamless editing guided by the references to Williams diary entries not only is it well worth the watching but it is terrificly written and performed piece masterful production about one of the great master of comedy and his life The realism really comes home with the little things the fantasy of the guard which rather than use the traditional dream techniques remains solid then disappears It plays on our knowledge and our senses particularly with the scenes concerning Orton and Halliwell and the sets particularly of their flat with Halliwell murals decorating every surface are terribly well done . I thought this was wonderful way to spend time on too hot summer weekend sitting in the air conditioned theater and watching light hearted comedy The plot is simplistic but the dialogue is witty and the characters are likable even the well bread suspected serial killer While some may be disappointed when they realize this is not Match Point Risk Addiction thought it was proof that Woody Allen is still fully in control of the style many of us have grown to love This was the most d laughed at one of Woody comedies in years dare say decade While ve never been impressed with Scarlet Johanson in this she managed to tone down her sexy image and jumped right into average but spirited young woman This may not be the crown jewel of his career but it was wittier than Devil Wears Prada and more interesting than Superman great comedy to go see with friends .

YuBeomGon commented 4 years ago

its closed, it takes long time, about 2 hours. thanks.