Calling paddle.bernoulli while preprocessing data for an MLM task with Paddle fails on GPU but runs normally on CPU

jyjfjyjf commented 2 years ago

Describe the Bug

Environment where the error occurs: Windows 11, Python 3.7, RTX 3090, CUDA 11.2, paddlepaddle-gpu 2.3

Code:

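# Imports inferred from usage; the original post omitted them.
import json
from abc import ABC
from typing import Any, Optional, Tuple

import paddle
from paddle.io import Dataset
from tqdm import tqdm


def masked_fill(x, mask, value):
    # Return a copy of x with the positions where mask is True set to value.
    y = paddle.full(x.shape, value, x.dtype)
    return paddle.where(mask, y, x)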

class MLMDataset(Dataset, ABC):
    def __init__(self, tokenizer, args):

        super().__init__()
        data = []
        with open(args.train_data_path, 'r', encoding='utf-8') as f:
            data.extend(json.load(f))
        with open(args.valid_data_path, 'r', encoding='utf-8') as f:
            data.extend(json.load(f))
        with open(args.test_data_path, 'r', encoding='utf-8') as f:
            data.extend(json.load(f))

        self.tokenizer = tokenizer
        self.mlm_probability = 0.15

        self.input_ids = []
        self.attention_mask = []
        self.type_ids = []
        self.labels = []
        for d in tqdm(data, desc='encode text'):
            text = d['sentence']

            encodings = tokenizer(text,
                                  max_length=256,
                                  truncation=True,
                                  stride=64,
                                  return_overflowing_tokens=True,
                                  return_special_tokens_mask=True,
                                  return_attention_mask=True)

            special_tokens_mask = encodings['special_tokens_mask']
            special_tokens_mask = [1] + special_tokens_mask + [1]
            input_ids, labels = self.paddle_mask_tokens(paddle.to_tensor(encodings['input_ids']),
                                                        paddle.to_tensor(special_tokens_mask))
            self.input_ids.append(input_ids.tolist())
            self.attention_mask.append(encodings['attention_mask'])
            self.type_ids.append(encodings['token_type_ids'])
            self.labels.append(labels.tolist())

    def __len__(self):
        return len(self.input_ids)

    def __getitem__(self, item):
        return {
            'input_ids': self.input_ids[item],
            'attention_mask': self.attention_mask[item],
            'type_ids': self.type_ids[item],
            'label': self.labels[item]
        }

    def paddle_mask_tokens(self, inputs: Any, special_tokens_mask: Optional[Any] = None) -> Tuple[Any, Any]:
        """
        Prepare masked tokens inputs/labels for masked language modeling: 80% MASK, 10% random, 10% original.
        """

        labels = inputs.clone()
        # We sample a few tokens in each sequence for MLM training (with probability `self.mlm_probability`)
        probability_matrix = paddle.full(labels.shape, self.mlm_probability)
        if special_tokens_mask is None:
            special_tokens_mask = [
                self.tokenizer.get_special_tokens_mask(val, already_has_special_tokens=True) for val in labels.tolist()
            ]
            special_tokens_mask = paddle.to_tensor(special_tokens_mask, dtype=paddle.bool)
        else:
            special_tokens_mask = special_tokens_mask.astype(paddle.bool)

        probability_matrix = masked_fill(probability_matrix, special_tokens_mask, 0.0)
        masked_indices = paddle.bernoulli(probability_matrix).astype(paddle.bool)
        labels[~masked_indices] = -100  # We only compute loss on masked tokens

        # 80% of the time, we replace masked input tokens with tokenizer.mask_token ([MASK])
        indices_replaced = paddle.bernoulli(paddle.full(labels.shape, 0.8)).astype(paddle.bool) & masked_indices
        inputs = masked_fill(inputs, indices_replaced, self.tokenizer.convert_tokens_to_ids(self.tokenizer.mask_token))
        # 10% of the time, we replace masked input tokens with random word
        indices_random = paddle.bernoulli(paddle.full(labels.shape, 0.5)).astype(paddle.bool) & masked_indices & ~indices_replaced
        random_words = paddle.randint(high=len(self.tokenizer), shape=labels.shape, dtype=paddle.int64)
        inputs = paddle.where(indices_random, random_words, inputs)

        # The rest of the time (10% of the time) we keep the masked input tokens unchanged
        return inputs, labels

Error message:

'''
File "F:/workspace/CCAC2022_event_detect/src/pretrain.py", line 93, in paddle_mask_tokens
    indices_replaced = paddle.bernoulli(paddle.full(labels.shape, 0.8)).astype(paddle.bool) & masked_indices
File "D:\sd\Anaconda\Anaconda\envs\paddle_2_3\lib\site-packages\paddle\tensor\random.py", line 74, in bernoulli
    return _C_ops.bernoulli(x)
SystemError: (Fatal) Operator bernoulli raises an class thrust::system::system_error exception.
The exception content is :transform: failed to synchronize: cudaErrorLaunchFailure: unspecified launch failure. (at ..\paddle\fluid\imperative\tracer.cc:307)
'''
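The traceback points at the second `paddle.bernoulli` call, but CUDA kernels are launched asynchronously, so a launch failure can be reported by a later call than the one that caused it. As a rough sketch for narrowing this down, the snippet below runs only `paddle.bernoulli` on a chosen device and synchronizes right after it. The helper name `run_bernoulli_check` and its default shape are hypothetical, and the import is guarded so the script also runs where Paddle is not installed.

```python
def run_bernoulli_check(device="gpu", shape=(1, 256), p=0.8):
    """Run paddle.bernoulli in isolation and return a short status string."""
    try:
        import paddle
    except ImportError:
        return "paddle is not installed"

    if device == "gpu" and not paddle.is_compiled_with_cuda():
        return "this paddle build has no CUDA support"

    try:
        paddle.set_device(device)
        sample = paddle.bernoulli(paddle.full(list(shape), p))
        if device == "gpu":
            # Force pending kernels to finish so that an asynchronous
            # failure surfaces here rather than in a later, unrelated call.
            paddle.device.cuda.synchronize()
        return "ok, fraction of ones = %.3f" % float(sample.numpy().mean())
    except Exception as exc:  # SystemError in the report above
        return "failed: %r" % (exc,)


if __name__ == "__main__":
    print("cpu:", run_bernoulli_check("cpu"))
    print("gpu:", run_bernoulli_check("gpu"))
```

If the isolated GPU call succeeds, the failure in the full script may originate from an earlier kernel, for example the `paddle.where` call inside `masked_fill`, rather than from `bernoulli` itself.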

[Screenshot: the code running normally on AI Studio]

[Screenshot: the code running normally locally with paddlepaddle-cpu]

Additional Supplementary Information

No response

paddle-bot[bot] commented 2 years ago

Hi! We have received your issue and will arrange for a technician to respond as soon as possible; please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment and version details, and the error message. In the meantime, you may also find an answer in the official API documentation, the FAQ, past GitHub issues, or the AI community. Have a nice day!

jyjfjyjf commented 2 years ago

[Screenshot: the error when running locally with paddlepaddle-gpu 2.3.0]

jyjfjyjf commented 2 years ago

It's exactly the same code, with nothing changed. It fails locally on GPU, but it works on the AI Studio GPU and on my local CPU.
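Since the same masking logic runs on CPU, and the masking happens once while the dataset is built, one possible workaround is to do the sampling on the host and skip the GPU op entirely. This was not proposed in the thread; the sketch below is a plain-Python version of the 80% / 10% / 10% scheme from `paddle_mask_tokens`, using only the standard library. The function name `mask_tokens_cpu` and its parameters are hypothetical.

```python
import random

IGNORE_INDEX = -100  # label value for positions that do not contribute to the loss


def mask_tokens_cpu(input_ids, special_tokens_mask, mask_token_id,
                    vocab_size, mlm_probability=0.15, rng=None):
    """Return (masked_inputs, labels) for one sequence of token ids."""
    if len(input_ids) != len(special_tokens_mask):
        raise ValueError("input_ids and special_tokens_mask differ in length")
    if rng is None:
        rng = random.Random()

    inputs = list(input_ids)
    labels = [IGNORE_INDEX] * len(inputs)
    for i, (token, is_special) in enumerate(zip(input_ids, special_tokens_mask)):
        # Special tokens are never selected; other tokens are selected
        # with probability mlm_probability.
        if is_special or rng.random() >= mlm_probability:
            continue
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_token_id              # 80%: replace with [MASK]
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: keep the original token
    return inputs, labels


if __name__ == "__main__":
    ids = [101, 2769, 4263, 872, 102]
    special = [1, 0, 0, 0, 1]
    print(mask_tokens_cpu(ids, special, mask_token_id=103, vocab_size=21128))
```

The explicit length check is deliberate: a mismatch between the token ids and the special-tokens mask is reported as a clear error on the host instead of reaching a device kernel.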

jyjfjyjf commented 2 years ago

I've already finished the whole MLM pretraining run on AI Studio.

ziyoujiyi commented 2 years ago

bernoulli is probably the first CUDA op that runs, right? Could you first check whether the GPU driver installed on Windows is working properly?
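One way to act on this suggestion is Paddle's built-in installation check, `paddle.utils.run_check()`, which runs a small computation on the available device. The wrapper below is only a sketch: the name `check_paddle_install` is hypothetical, and the import is guarded so the script also runs where Paddle is not installed.

```python
def check_paddle_install():
    """Print basic build information and run Paddle's built-in self-check.

    Returns True if the check completed without raising, False otherwise.
    """
    try:
        import paddle
    except ImportError:
        print("paddle is not installed in this environment")
        return False

    try:
        print("paddle version:", paddle.__version__)
        print("compiled with CUDA:", paddle.is_compiled_with_cuda())
        print("current device:", paddle.device.get_device())
        # Runs a small computation on the available device(s).
        paddle.utils.run_check()
    except Exception as exc:
        print("check failed:", repr(exc))
        return False
    return True


if __name__ == "__main__":
    check_paddle_install()
```

A passing check only shows that a basic computation runs on the device; it does not rule out a problem specific to one operator.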

ziyoujiyi commented 2 years ago

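You could try some other CUDA ops first. If they also fail, the problem is in your local environment; if they work, we will then investigate whether this op has a bug.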
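The check described above could look roughly like the following sketch, which runs a handful of ordinary ops on the GPU and records which ones succeed. The helper name `probe_cuda_ops` and the selection of ops are assumptions made for illustration. `bernoulli` is placed last on purpose: according to the CUDA runtime documentation, a `cudaErrorLaunchFailure` leaves the process in an inconsistent state, so any op run after a failing one would be expected to fail as well.

```python
def probe_cuda_ops():
    """Run a few ops on the GPU and return a {op_name: status} dict.

    Returns an empty dict if paddle is missing or was built without CUDA.
    """
    try:
        import paddle
    except ImportError:
        return {}
    if not paddle.is_compiled_with_cuda():
        return {}

    try:
        paddle.set_device("gpu")
    except Exception as exc:
        return {"set_device": "failed: %r" % (exc,)}

    # Dicts preserve insertion order, so bernoulli runs last.
    ops = {
        "full": lambda: paddle.full([4, 4], 0.5),
        "rand": lambda: paddle.rand([4, 4]),
        "matmul": lambda: paddle.matmul(paddle.rand([4, 4]), paddle.rand([4, 4])),
        "randint": lambda: paddle.randint(high=100, shape=[4, 4], dtype="int64"),
        "where": lambda: paddle.where(
            paddle.rand([4, 4]) > 0.5, paddle.ones([4, 4]), paddle.zeros([4, 4])
        ),
        "bernoulli": lambda: paddle.bernoulli(paddle.full([4, 4], 0.5)),
    }

    results = {}
    for name, op in ops.items():
        try:
            op()
            # Wait for the kernel so a failure is attributed to this op.
            paddle.device.cuda.synchronize()
            results[name] = "ok"
        except Exception as exc:
            results[name] = "failed: %r" % (exc,)
    return results


if __name__ == "__main__":
    for name, status in probe_cuda_ops().items():
        print(name, "->", status)
```

If every op before `bernoulli` reports "ok" and only `bernoulli` fails, that points toward the operator rather than the driver or the installation.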

jyjfjyjf commented 2 years ago

The GPU driver should be fine. I have used torch-gpu with CUDA 11.6 and with CUDA 11.2 on this machine without any problems.

jyjfjyjf commented 2 years ago

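For the other ops that had problems, I filed bug reports, and your team reproduced them and replied to me today. Most ops work fine; the one remaining problem is that this one will not run on my local machine at all.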

PaddlePaddle / Paddle

Calling paddle.bernoulli while preprocessing data for an MLM task with Paddle fails on GPU but runs normally on CPU #44197
