ValueError: Number of processes must be at least 1

Magpi007 commented 5 years ago

Hi,

When training the model, I get this error:

I am just running the code for the first time; I haven't looked into it much yet.

ThilinaRajapakse commented 5 years ago

~~I thought I fixed this, sorry.~~

~~In this function definition, change the default value of process count to 1 for Google Colab. (Colab has 1 vCPU if I remember correctly).~~

Edit: I couldn't reproduce the error when I ran the notebook. The Colab notebook specifies a process count of 2 when calling convert_examples_to_features().

Magpi007 commented 5 years ago

So this change is only for Colab? If I run it on my local laptop, can I use the version that is in the repo?

ThilinaRajapakse commented 5 years ago

Yes, the local version will work fine. By default, process_count is set to the number of available CPU cores minus 2 (cpu_count() - 2). A modern computer will typically have more than 2 cores, so it's fine. ~~But for Colab, the number is 1 so that makes the process_count -1, which throws the error because at least 1 process is needed.~~

Edit: The cpu_count on Colab is two, and the notebook is configured to use 2 as the process_count.
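
For readers following the arithmetic: a default of cpu_count() - 2 evaluates to 0 on a two-core machine, and multiprocessing.Pool rejects any count below 1 with exactly the message in the issue title. The sketch below only illustrates that mechanism. It is not a confirmed diagnosis of this report, and the helper names default_process_count and pool_error_for are hypothetical, not part of the repo.

```python
from multiprocessing import Pool


def default_process_count(cores):
    """Mirror the default discussed in the thread: available cores minus 2."""
    return cores - 2


def pool_error_for(process_count):
    """Return the ValueError message Pool raises for this count, or None if it starts."""
    try:
        with Pool(process_count):
            return None
    except ValueError as err:
        return str(err)


if __name__ == "__main__":
    # Assumption: a machine reporting 2 cores, as stated for Colab in the thread.
    count = default_process_count(2)
    print(count, pool_error_for(count))
```

If the caller passes process_count explicitly (the thread says the Colab notebook passes 2), the default is never used and this path is not hit.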

ThilinaRajapakse commented 5 years ago

I ran the Colab notebook again and it works without issues. The above change is unnecessary.

Magpi007 commented 5 years ago

Hmm, what could it be then? I just ran it again and got the same error. These are the resources that I have allocated:

ThilinaRajapakse commented 5 years ago

Try setting process_count to 1 in the call to convert_examples_to_features() inside the load_and_cache_examples() function.

features = convert_examples_to_features(examples, label_list, args['max_seq_length'], tokenizer, output_mode,
            cls_token_at_end=bool(args['model_type'] in ['xlnet']),            # xlnet has a cls token at the end
            cls_token=tokenizer.cls_token,
            sep_token=tokenizer.sep_token,
            cls_token_segment_id=2 if args['model_type'] in ['xlnet'] else 0,
            pad_on_left=bool(args['model_type'] in ['xlnet']),                 # pad on the left for xlnet
            pad_token_segment_id=4 if args['model_type'] in ['xlnet'] else 0,
            process_count=1)

Magpi007 commented 5 years ago

I got this error when I changed it:

Anyway, let me review the code. I have been away from this for the last few days, so I want to check that I have been following all the steps correctly.

ThilinaRajapakse commented 5 years ago

Are you using a local copy (local to your Google Drive, that is)? I think this bug was there in the original notebook, but it was fixed later. The function should look like this:

def convert_examples_to_features(examples, label_list, max_seq_length,
                                 tokenizer, output_mode,
                                 cls_token_at_end=False, pad_on_left=False,
                                 cls_token='[CLS]', sep_token='[SEP]', pad_token=0,
                                 sequence_a_segment_id=0, sequence_b_segment_id=1,
                                 cls_token_segment_id=1, pad_token_segment_id=0,
                                 mask_padding_with_zero=True,
                                 process_count=cpu_count() - 2):
    """ Loads a data file into a list of `InputBatch`s
        `cls_token_at_end` define the location of the CLS token:
            - False (Default, BERT/XLM pattern): [CLS] + A + [SEP] + B + [SEP]
            - True (XLNet/GPT pattern): A + [SEP] + B + [SEP] + [CLS]
        `cls_token_segment_id` define the segment id associated to the CLS token (0 for BERT, 2 for XLNet)
    """

    label_map = {label : i for i, label in enumerate(label_list)}

    examples = [(example, label_map, max_seq_length, tokenizer, output_mode, cls_token_at_end, cls_token, sep_token, cls_token_segment_id, pad_on_left, pad_token_segment_id) for example in examples]

    with Pool(process_count) as p:
        features = list(tqdm(p.imap(convert_example_to_feature, examples, chunksize=100), total=len(examples)))

    return features
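
One possible hardening of the default above, offered as a sketch rather than as the repo's actual code: resolve the worker count at call time and clamp it so it never drops below 1. The helper resolve_process_count and its reserve parameter are hypothetical names introduced here for illustration.

```python
from multiprocessing import cpu_count


def resolve_process_count(process_count=None, reserve=2):
    """Hypothetical helper: choose a worker count that is never below 1.

    If process_count is None, fall back to (available cores - reserve),
    which matches the cpu_count() - 2 default shown in the thread.
    Any result below 1, including an explicit 0 or a negative value,
    is clamped to 1 so that Pool() does not raise ValueError.
    """
    if process_count is None:
        process_count = cpu_count() - reserve
    return max(1, process_count)
```

Under this assumption, the function signature would take process_count=None and the body would call Pool(resolve_process_count(process_count)), so a one- or two-core machine falls back to a single worker instead of failing.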

Magpi007 commented 5 years ago

Yeah, maybe that's it. I know it's better to fork your repo so I get the updates/fixes right away, but I like to first understand the code by recreating it in my own notebook. I am using Colab linked to Google Drive. I will check that and let you know. Thanks.

ThilinaRajapakse commented 5 years ago

Understandable! Let me know how it goes.

Magpi007 commented 5 years ago

With a fresh head it's easier to see clearly. There were two things that I changed to make it work:

1. When making the dataframes BERT-friendly, I didn't pass the parameter columns=train_df_bert.columns to the to_csv function.
2. In the function load_and_cache_examples, I was not including the Colab add-in parameter undersample_scale_factor=0.1.

Maybe the first point was the one causing the error? Anyway, sorry for my lapse. I will keep iterating on it and let you know if I see any suspicious bugs.

ThilinaRajapakse commented 5 years ago

Weird. Neither of those things should be throwing a "number of processes" error as far as I can tell. That error comes from the multiprocessing used for converting examples to features. Oh well, we don't need to worry about it if it's working!

Magpi007 commented 5 years ago

I ran into this problem again on another iteration, and I changed this:

process_count = cpu_count() - 2

to this

process_count = 1

in the function convert_examples_to_features in the utils.py file, and it worked. I am working on Colab. Does that make sense to you?

ThilinaRajapakse commented 5 years ago

Yes, that would fix all multiprocessing-related issues, at the expense of not using multiprocessing at all. I think you can get away with setting it to 2 on Colab. Setting it to 2 should speed things up a bit, but setting it to 1 will ensure that you won't get multiprocessing-related errors.
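
The trade-off described above can be sketched as a small wrapper that skips the pool entirely when only one process is requested and uses Pool otherwise. This is a minimal illustration under stated assumptions, not the repo's implementation: map_with_optional_pool is a hypothetical helper, and square is a toy stand-in for convert_example_to_feature.

```python
from multiprocessing import Pool


def map_with_optional_pool(func, items, process_count=1, chunksize=100):
    """Hypothetical helper: apply func to each item, serially or with a Pool.

    With process_count <= 1 the work runs in the current process, so no
    worker processes are created and Pool cannot raise its ValueError.
    With a larger count, the items are distributed with Pool.imap, which
    preserves input order.
    """
    if process_count <= 1:
        return [func(item) for item in items]
    with Pool(process_count) as p:
        return list(p.imap(func, items, chunksize=chunksize))


def square(x):
    """Toy stand-in for the per-example conversion function."""
    return x * x


if __name__ == "__main__":
    print(map_with_optional_pool(square, [1, 2, 3], process_count=1))
```

Note that func must be defined at module level to be picklable when a pool is used, which is also true of the function passed to p.imap in the thread's code.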

ThilinaRajapakse / pytorch-transformers-classification

ValueError: Number of processes must be at least 1 #4