Closed: Magpi007 closed this 5 years ago
I thought I fixed this, sorry.
In this function definition, change the default value of process count to 1 for Google Colab. (Colab has 1 vCPU if I remember correctly).
Edit: Couldn't reproduce the error when I ran the notebook. The Colab notebook specifies a process count of 2 when calling convert_examples_to_features()
So this change is only for Colab? If I run it on my local laptop, can I use the version that is in the repo?
Yes, the local version will work fine. By default, process_count is set to the number of CPU cores available minus 2 (cpu_count() - 2). On a modern computer you will almost certainly have more than 2 cores, so it's fine. But on Colab the number is 1, which makes process_count -1 and throws the error, because at least 1 process is needed.
Edit: The cpu_count on Colab is two, and the notebook is configured to use 2 as the process_count.
I ran the Colab notebook again, it works without issues. The above change is unnecessary.
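To make the arithmetic in this comment concrete, here is a minimal sketch of how the "cores minus 2" default behaves on small machines. The helper names default_process_count and pool_rejects are hypothetical and not part of the notebook; the only library behavior relied on is that multiprocessing.Pool raises ValueError when asked for fewer than 1 worker.

```python
from multiprocessing import Pool


def default_process_count(cores):
    # Same arithmetic as the default argument discussed above: cores minus 2.
    return cores - 2


def pool_rejects(process_count):
    # Pool raises ValueError when asked for fewer than 1 worker process.
    try:
        with Pool(process_count):
            return False
    except ValueError:
        return True
```

With one core the default evaluates to -1, and with two cores it evaluates to 0; Pool rejects both values. So the default is only safe on machines with at least three cores, unless the caller passes process_count explicitly, as the Colab notebook does.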
Hmm, what could it be then? I just ran it again and got the same error. These are the resources that I have allocated:
Try setting process_count to 1 in the call to convert_examples_to_features() inside load_and_cache_examples():
features = convert_examples_to_features(examples, label_list, args['max_seq_length'], tokenizer, output_mode,
    cls_token_at_end=bool(args['model_type'] in ['xlnet']),  # xlnet has a cls token at the end
    cls_token_segment_id=2 if args['model_type'] in ['xlnet'] else 0,
    pad_on_left=bool(args['model_type'] in ['xlnet']),  # pad on the left for xlnet
    pad_token_segment_id=4 if args['model_type'] in ['xlnet'] else 0,
    process_count=1)  # the setting suggested above
I got this error when I changed it:
Anyway, let me review the code. I have been away from this for the last few days, so I want to check that I have been following all the steps correctly.
Are you using a local copy (local to your Google Drive, that is)? I think this bug was there in the original notebook, but it was fixed later. The function should look like this:
# Imports this function relies on
from multiprocessing import Pool, cpu_count
from tqdm import tqdm

def convert_examples_to_features(examples, label_list, max_seq_length,
                                 tokenizer, output_mode,
                                 cls_token_at_end=False, pad_on_left=False,
                                 cls_token='[CLS]', sep_token='[SEP]', pad_token=0,
                                 sequence_a_segment_id=0, sequence_b_segment_id=1,
                                 cls_token_segment_id=1, pad_token_segment_id=0,
                                 process_count=cpu_count() - 2):
    """ Loads a data file into a list of `InputBatch`s
        `cls_token_at_end` defines the location of the CLS token:
            - False (Default, BERT/XLM pattern): [CLS] + A + [SEP] + B + [SEP]
            - True (XLNet/GPT pattern): A + [SEP] + B + [SEP] + [CLS]
        `cls_token_segment_id` defines the segment id associated to the CLS token (0 for BERT, 2 for XLNet)
    """
    label_map = {label: i for i, label in enumerate(label_list)}
    # Pack all per-example arguments into one tuple, since imap passes a single item to the worker
    examples = [(example, label_map, max_seq_length, tokenizer, output_mode, cls_token_at_end, cls_token, sep_token, cls_token_segment_id, pad_on_left, pad_token_segment_id) for example in examples]
    with Pool(process_count) as p:
        features = list(tqdm(p.imap(convert_example_to_feature, examples, chunksize=100), total=len(examples)))
    return features
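The packing step in the function above is there because Pool.imap hands each worker call a single item. A minimal sketch of that pattern follows, using a toy worker; toy_convert, pack_examples, convert_all, and the (text, label) example shape are hypothetical stand-ins, not the notebook's convert_example_to_feature.

```python
from multiprocessing import Pool


def toy_convert(packed):
    # Hypothetical stand-in for convert_example_to_feature:
    # unpack the single tuple that imap delivers to the worker.
    (text, label), label_map, max_seq_length = packed
    tokens = text.split()[:max_seq_length]
    return tokens, label_map[label]


def pack_examples(examples, label_list, max_seq_length):
    # Bundle the shared arguments with each example, one tuple per item.
    label_map = {label: i for i, label in enumerate(label_list)}
    return [(example, label_map, max_seq_length) for example in examples]


def convert_all(examples, label_list, max_seq_length, process_count=1):
    packed = pack_examples(examples, label_list, max_seq_length)
    with Pool(process_count) as p:
        # imap yields results in input order, so features line up with examples.
        return list(p.imap(toy_convert, packed, chunksize=100))
```

Because the worker takes one tuple, the same function can be run without any pool, for example list(map(toy_convert, packed)), which is handy for checking the conversion logic separately from the multiprocessing setup.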
Yeah, maybe that's it. I know it's better to fork your repo so I get the updates/fixes instantly, but I like to first understand the code by recreating it in my own notebook. I am using Colab linked to Google Drive. I will check that and let you know. Thanks.
Understandable! Let me know how it goes.
With a fresh head it is easier to see clearly. There were two things that I changed that made it work:
to the to.csv
I was not including the Colab add-in parameter undersample_scale_factor=0.1.
Maybe the first point was the one causing the error? Anyway, sorry for my lapse. I will keep iterating on it and let you know if I see any suspicious bug.
Weird. Neither of those things should be throwing a "number of processes" error as far as I can tell. That error comes from the multiprocessing used for converting examples to features. Oh well, we don't need to worry about it if it's working!
I ran into this problem again in another iteration, and I changed this:
process_count = cpu_count() - 2
to this:
process_count = 1
in the function convert_examples_to_features of the file, and it worked. I am working on Colab. Does that make sense to you?
Yes, that would fix all multiprocessing related issues at the expense of not using multiprocessing at all. I think you can get away with setting it to 2 on Colab. Setting it to 2 should speed things up a bit but setting it to 1 will ensure that you won't get multiprocessing related errors.
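One way to avoid hard-coding either value is to clamp the computed count so it never drops below 1. This is only a sketch under stated assumptions: safe_process_count and its reserve parameter are hypothetical names, not something the notebook or repo provides.

```python
from multiprocessing import cpu_count


def safe_process_count(cores=None, reserve=2):
    # Hypothetical helper: leave `reserve` cores free, but never go below 1 worker.
    if cores is None:
        cores = cpu_count()
    return max(1, cores - reserve)
```

It could then be passed explicitly, for example process_count=safe_process_count(), in the call to convert_examples_to_features(). On a two-core machine this yields 1; to use both cores as suggested above, safe_process_count(reserve=0) would give 2.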
When training the model, I get this error:
I am just running the code for the first time, I haven't checked it too much yet...