nateraw closed this issue 5 years ago
@nateraw Sorry, that was my mistake. The README has just been updated to:
bert -c data/corpus.small -v data/vocab.small -o output/bert.model
which uses the same corpus as bert-vocab.
The corpus example is in the README, under "0. prepare your own corpus".
Thanks.
Why might this be happening then? I ran these lines...
bert-vocab -c data/dummy_data.small -o data/vocab.small
bert -c data/dummy_data.small -v data/vocab.small -o output/bert.model
Dummy data looks like this:
@nateraw Can you update the bert-pytorch version to 0.0.1a4?
pip install -U bert-pytorch
Interestingly, a different error:
@nateraw I found what was wrong with both your example corpus and mine.
The file must not end with a blank line! If you check line 18 of your corpus, there is a \n on the last line, which means that Python sees one more (empty) line.
Please remove the trailing \n at the end of the last line. That should fix it.
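For anyone hitting the same error, here is a minimal sketch of stripping the trailing newline from a corpus file before running bert-vocab and bert. The helper names strip_trailing_newlines and clean_corpus_file are hypothetical and not part of bert-pytorch. The sketch also assumes the corpus is a UTF-8 text file.

```python
from pathlib import Path


def strip_trailing_newlines(text: str) -> str:
    """Return text with any newline characters at the very end removed."""
    # Hypothetical helper: only trailing "\n" / "\r" characters are dropped,
    # newlines between corpus lines are left untouched.
    return text.rstrip("\r\n")


def clean_corpus_file(path: str) -> None:
    """Rewrite the corpus file at `path` in place without trailing newlines."""
    corpus = Path(path)
    cleaned = strip_trailing_newlines(corpus.read_text(encoding="utf-8"))
    corpus.write_text(cleaned, encoding="utf-8")
```

Calling clean_corpus_file("data/corpus.small") would rewrite that file in place, so keep a backup if the original matters.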
Wow so dumb! My fault, I will report back if that change works or not.
The tabs were a problem as well. I replaced them with literal tab characters, and it worked. Closing the issue, thank you!
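A quick check along these lines can catch both problems from this thread before training. It is only a sketch: it assumes each corpus line holds two sentences separated by exactly one literal tab character, as the README corpus example suggests, and find_bad_lines is a hypothetical helper, not part of bert-pytorch.

```python
from typing import List


def find_bad_lines(text: str) -> List[int]:
    """Return 1-based numbers of lines that do not contain exactly one tab."""
    bad = []
    # Splitting on "\n" (instead of splitlines) keeps the empty final line
    # that a trailing newline produces, so that case is reported too.
    for number, line in enumerate(text.split("\n"), start=1):
        if line.count("\t") != 1:
            bad.append(number)
    return bad


if __name__ == "__main__":
    sample = "first sentence\tsecond sentence\nno tab here\nescaped\\ttab\n"
    # Line 2 has no tab, line 3 has a backslash followed by "t" rather than
    # a real tab, and line 4 is the empty line caused by the trailing newline.
    print(find_bad_lines(sample))
```

An empty result would mean every line has the assumed two-column shape and the file does not end with a newline.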
Could you give a concrete example of the input data? You gave an example of the corpus data, but not the dataset.small file found in this line:
bert -c data/dataset.small -v data/vocab.small -o output/bert.model
If you could show a couple of examples, that would be very helpful! I am new to PyTorch, so the dataloader function is a little confusing.