Closed: mshahriarinia closed this issue 6 years ago
The solution was to change
fout_name2id.write(key + "/" + value["title"] + "\t" + str(index) + "\n")
to
fout_name2id.write(key + "/" + value["title"] + "\t" + index.encode('utf-8') + "\n")
If you are using Linux, changing your locale also works. That fixed it for me without changing any code.
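For anyone hitting this on Python 3, here is a minimal sketch of a third option: open the output file with an explicit UTF-8 encoding, so the write depends on neither the system locale nor a manual .encode() call. This is not the repository's code. The function name write_name2id and the shape of the entries argument are made up for illustration; only the fout_name2id variable and the line layout come from the snippet above.

```python
def write_name2id(path, entries):
    """Write 'key/title<TAB>index' lines as UTF-8, whatever the locale is.

    `entries` is a hypothetical iterable of (key, value, index) tuples,
    where value is a dict with a "title" field.
    """
    # An explicit encoding means Python never falls back to the locale default.
    # newline="\n" keeps line endings identical across platforms.
    with open(path, "w", encoding="utf-8", newline="\n") as fout_name2id:
        for key, value, index in entries:
            fout_name2id.write(key + "/" + value["title"] + "\t" + str(index) + "\n")
```

With the file opened this way, str(index) is safe in Python 3 because the encoding happens once, at the file layer, rather than during string concatenation.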
Build fails
./extraction/full_preprocess.sh ${DATA_DIR} en
: