required setup?

steviedrew67 commented 1 year ago

Hi, I'm very interested in trying this out, but I'm running into errors in the environment and wondering whether the setup is the issue.

I'm running on Paperspace. Here is the output of nvidia-smi:

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

And below is the error:

RuntimeError                              Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 pipe = pipeline("text-classification-synthesis",
      2                 model="EleutherAI/gpt-neo-2.7B",
      3                 device=1)

File /usr/local/lib/python3.9/dist-packages/mutate/__init__.py:107, in pipeline(task, model, device, generation_kwargs, **kwargs)
    101 else:
    102     raise ValueError(
    103         f"Task - {task} is not supported. Supported tasks -"
    104         f" {SUPPORTED_TASKS.keys()}"
    105     )
--> 107 return pipeline_class(model, device, **generation_kwargs)

File /usr/local/lib/python3.9/dist-packages/mutate/pipelines/text_classification.py:69, in TextClassificationSynthesize.__init__(self, model, device, **generate_kwargs)
     24 def __init__(
     25     self,
     26     model: Union[str, PreTrainedModel],
     27     device: Optional[int] = -1,
     28     **generate_kwargs
     29 ):
     30     """
     31     Pipeline to synthesize Text classification examples from a given dataset.
     32
   (...)
     67
     68     """
---> 69     self.infer = TextGeneration(model_name=model, device=device)
     70     self._collate_fn = partial(
     71         TextClassSynthesizePromptDataset._collate_fn,
     72         self.infer.tokenizer,
     73         self.infer.device,
     74     )
     75     self.generate_kwargs = (
     76         self.generate_kwargs if not generate_kwargs else generate_kwargs
     77     )

File /usr/local/lib/python3.9/dist-packages/mutate/infer.py:41, in TextGeneration.__init__(self, model_name, device)
     39 self.tokenizer.pad_token = self.tokenizer.eos_token
     40 self.device = torch.device(f"cuda:{device}" if device>=0 else "cpu")
---> 41 self.model.to(self.device)
     42 self.model.eval()

File /usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:927, in Module.to(self, *args, **kwargs)
    923         return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    924                     non_blocking, memory_format=convert_to_format)
    925     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
--> 927 return self._apply(convert)

File /usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:579, in Module._apply(self, fn)
    577 def _apply(self, fn):
    578     for module in self.children():
--> 579         module._apply(fn)
    581     def compute_should_use_set_data(tensor, tensor_applied):
    582         if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
    583             # If the new tensor has compatible tensor type as the existing tensor,
    584             # the current behavior is to change the tensor in-place using .data =,
   (...)
    589             # global flag to let the user control whether they want the future
    590             # behavior of overwriting the existing tensor or not.

File /usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:602, in Module._apply(self, fn)
    598 # Tensors stored in modules are graph leaves, and we don't want to
    599 # track autograd history of param_applied, so we have to use
    600 # with torch.no_grad():
    601 with torch.no_grad():
--> 602     param_applied = fn(param)
    603 should_use_set_data = compute_should_use_set_data(param, param_applied)
    604 if should_use_set_data:

File /usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:925, in Module.to.<locals>.convert(t)
    922 if convert_to_format is not None and t.dim() in (4, 5):
    923     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    924                 non_blocking, memory_format=convert_to_format)
--> 925 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

infinitylogesh commented 1 year ago

Hi @steviedrew67,

Thank you for trying Mutate. In your pipeline call, device is set to 1. That argument expects the ordinal (device ID) of the GPU on your machine, and ordinals start at 0. From your nvidia-smi output, it looks like you have one GPU, so the value has to be 0.

I suspect this is the reason for the error. Could you please try that and let me know if you face any further issues? I'd be happy to help.

    from mutate import pipeline

    pipe = pipeline("text-classification-synthesis",
                    model="EleutherAI/gpt-neo-2.7B",
                    device=0)  # was 1; 0 is the ordinal of the first (and only) GPU
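If it helps, here is a minimal sketch for checking a device ordinal before passing it to the pipeline. It is not part of Mutate: `visible_gpu_count` and `resolve_device` are hypothetical helper names. The mapping of a negative value to CPU mirrors the `torch.device(f"cuda:{device}" if device>=0 else "cpu")` line in the traceback above, and the GPU count comes from `torch.cuda.device_count()`.

```python
from typing import Optional


def visible_gpu_count() -> int:
    """Return how many CUDA devices PyTorch can see (0 if torch is missing)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def resolve_device(requested: int, available: Optional[int] = None) -> str:
    """Turn an integer device argument into a torch device string.

    Hypothetical helper: a negative value means CPU, otherwise the value
    must be a valid zero-based GPU ordinal, i.e. 0 <= requested < available.
    """
    if available is None:
        available = visible_gpu_count()
    if requested < 0:
        return "cpu"
    if requested >= available:
        raise ValueError(
            f"device={requested} is not a valid ordinal: "
            f"{available} GPU(s) visible, so valid values are "
            f"-1 (CPU) or 0..{available - 1}"
        )
    return f"cuda:{requested}"
```

On a machine with a single GPU, `resolve_device(0)` should give `"cuda:0"`, while `resolve_device(1)` raises a `ValueError` up front instead of failing later with "invalid device ordinal" while the model is being moved to the device.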

steviedrew67 commented 1 year ago

Thank you for helping me solve this!

infinitylogesh / mutate

required setup? #5