Closed zjiehang closed 5 years ago
I ran into the same problem. Have you solved it?
@zhijuny Yes. The error occurs at question 24494, whose answer "menid Empi" appears to be truncated. I changed it to "Achaemenid Empire" and the error went away. I checked Wikipedia, and the full answer is correct. I only saw this problem on Ubuntu (or perhaps Linux in general); on Windows no error occurred. The context of question 24494 also contains some special characters, so the cause may be a difference in character encoding between the two systems. Hope it helps!
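For anyone who wants to look for other answers like this one, here is a minimal sketch that flags answers which occur in the context only as a fragment cut out of longer words, the way "menid Empi" sits inside "Achaemenid Empire". It assumes the usual HotpotQA layout, where each example has an "answer" string and a "context" list of [title, sentences] pairs. The function names are hypothetical and are not part of prepro.py, and a flagged answer is only a suspect, not a confirmed cause of the error.

```python
def is_word_aligned(answer, text):
    """Return True if some occurrence of `answer` in `text` starts and ends
    on a word boundary (no letter or digit directly before or after it)."""
    start = text.find(answer)
    while start != -1:
        end = start + len(answer)
        before_ok = start == 0 or not text[start - 1].isalnum()
        after_ok = end == len(text) or not text[end].isalnum()
        if before_ok and after_ok:
            return True
        start = text.find(answer, start + 1)
    return False


def find_truncated_answers(examples):
    """Return (index, answer) pairs for answers that appear in the context
    only as a fragment of longer words. Assumes HotpotQA-style examples."""
    suspects = []
    for idx, ex in enumerate(examples):
        answer = ex["answer"]
        if not answer or answer in ("yes", "no"):
            continue  # yes/no answers are not spans of the context
        # Join all sentences of all paragraphs into one string.
        text = " ".join(" ".join(sents) for _title, sents in ex["context"])
        if answer in text and not is_word_aligned(answer, text):
            suspects.append((idx, answer))
    return suspects


# Usage (assumed file name, taken from the commands in this thread):
#   import json
#   with open("hotpot_train_v1.1.json", encoding="utf-8") as f:
#       print(find_truncated_answers(json.load(f)))
```

Opening the file with an explicit encoding="utf-8" also avoids depending on the platform default encoding, which differs between Windows and Linux.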
Thank you. I'll give it a try.
I also ran into this problem. Running
sed -ie "s/\"menid Empi\"/\"Achaemenid Empire\"/g" hotpot_train_v1.1.json
before
python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train
solves it.
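If sed is not available (for example on Windows), the same fix can be sketched in Python. This is only an assumed equivalent of the sed command above: it replaces the "answer" field when it is exactly "menid Empi", and the helper name patch_answer is hypothetical.

```python
import json


def patch_answer(examples, old="menid Empi", new="Achaemenid Empire"):
    """Replace answers that are exactly `old` with `new`, in place.
    Returns the number of examples that were changed."""
    patched = 0
    for ex in examples:
        if ex.get("answer") == old:
            ex["answer"] = new
            patched += 1
    return patched


# Usage (file name taken from the commands above):
#   with open("hotpot_train_v1.1.json", encoding="utf-8") as f:
#       data = json.load(f)
#   print(patch_answer(data), "answer(s) patched")
#   with open("hotpot_train_v1.1.json", "w", encoding="utf-8") as f:
#       json.dump(data, f, ensure_ascii=False)
```

Unlike the sed command, this touches only the "answer" field and leaves the context text unchanged.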
I got an error when preprocessing the data:

...........................................................................
/home/akshay/akshayproj/prepro.py in _process(sent=u'Adam Collis', is_sup_fact=False, is_title=True)
     93     flat_offsets = []
     94     start_end_facts = [] # (start_token_id, end_token_id, is_sup_fact=True/False)
     95     sent2title_ids = []
     96
     97     def _process(sent, is_sup_fact, is_title=False):
---> 98         nonlocal (text_context, context_tokens, context_chars, offsets, start_end_facts, flat_offsets)
        text_context = undefined
     99         N_chars = len(text_context)
    100
    101         sent = sent
    102         sent_tokens = word_tokenize(sent)

NameError: global name 'nonlocal' is not defined
Are you using Python 2.x? The 'nonlocal' keyword exists only in Python 3.x, so please check which Python version you are running.
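As a small illustration of what 'nonlocal' does in Python 3, here is a sketch of a nested function that rebinds a variable of its enclosing function. The names make_char_counter and n_chars are made up for this example and do not come from prepro.py; only the nested _process pattern mirrors the traceback above.

```python
import sys

# 'nonlocal' is a Python 3 keyword; under Python 2 this file would not even compile.
assert sys.version_info[0] >= 3, "Python 3 is required"


def make_char_counter():
    """Return a function that keeps a running total of characters seen."""
    n_chars = 0

    def _process(sent):
        # Without 'nonlocal', the assignment below would create a new local
        # variable instead of updating n_chars in the enclosing scope.
        nonlocal n_chars
        n_chars += len(sent)
        return n_chars

    return _process


process = make_char_counter()
print(process("Adam Collis"))  # 11 characters so far
```

Running "python --version" in the same environment used for main.py shows which interpreter is actually being picked up.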
I get an out-of-range error while processing the training dataset. It happens at around question 24263. Thanks!