salesforce / CodeTF

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
Apache License 2.0

Error when loading the MBPP dataset #44

Open crarojasca opened 1 year ago

crarojasca commented 1 year ago

The import for the base class of MBPPDataset appears to reference the wrong module file name. The correct module name seems to be "base_dataset" rather than "base_dataloader". Importing MBPPDataset fails with the following traceback:

╭─────────────────────────────── Traceback (most recent call last)────────────────────────────────╮
│ /fs04/qe26/PEFT/finetuning.py:3 in <module>                                                      │
│                                                                                                  │
│    1 from codetf.trainer.causal_lm_trainer import CausalLMTrainer                                │
│    2 # from codetf.data_utility.human_eval_dataset import HumanDataset                           │
│ ❱  3 from codetf.data_utility.mpp_dataset import MBPPDataset                                     │
│    4 # from codetf.data_utility.codexglue_dataset import CodeXGLUEDataset                        │
│    5 from codetf.models import load_model_pipeline                                               │
│    6 from codetf.performance.evaluation_metric import EvaluationMetric                           │
│                                                                                                  │
│ /scratch/qe26/crojasca/miniconda/conda/envs/jupyterlab/lib/python3.10/site-packages/codetf/data_ │
│ utility/mpp_dataset.py:6 in <module>                                                             │
│                                                                                                  │
│    3 import torch                                                                                │
│    4 import torch.nn.functional as F                                                             │
│    5 from datasets import load_dataset                                                           │
│ ❱  6 from codetf.data_utility.base_dataloader import BaseDataset                                 │
│    7 # from torch.utils.data import TensorDataset                                                │
│    8                                                                                             │
│    9 class MBPPDataset(BaseDataset):                                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'codetf.data_utility.base_dataloader'
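
To check which of the two module names actually exists in the installed package, something like the following could be run in the same environment. This is only a minimal sketch: `find_existing_module` is a hypothetical helper, not part of CodeTF, and `codetf.data_utility.base_dataset` is the name guessed above, not a confirmed module path.

```python
import importlib.util


def find_existing_module(candidates):
    """Return the first module name in `candidates` that can be located, or None."""
    for name in candidates:
        try:
            # find_spec imports parent packages, then looks up the submodule.
            if importlib.util.find_spec(name) is not None:
                return name
        except ImportError:
            # A parent package is missing or failed to import; try the next name.
            continue
    return None


if __name__ == "__main__":
    candidates = [
        "codetf.data_utility.base_dataloader",  # name used by the failing import
        "codetf.data_utility.base_dataset",     # assumed correct name (unconfirmed)
    ]
    print(find_existing_module(candidates))
```

If only the second name is reported, the import in the installed mpp_dataset.py points at a module that is not present in that installation. If `None` is printed, neither name could be located, for example because codetf itself is not importable.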

Paul-B98 commented 1 year ago

@crarojasca are you installing the package from pip or from GitHub?