NielsRogge / Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.
MIT License

Error while loading datasets #307

Open Rohinivv96 opened 1 year ago

Rohinivv96 commented 1 year ago

I am trying to fine-tune the SAM model. I load my data with the load_dataset function from datasets, which gives the following:

Dataset({
    features: ['image', 'label'],
    num_rows: 78
})

When I then try to load the ground-truth segmentation mask with the following code:

# load ground truth segmentation
ground_truth_seg = np.array(dataset[idx]["label"])
print(ground_truth_seg)
np.unique(ground_truth_seg.shape)

I get an empty array:

0
array([], dtype=float64)

Can anyone help me solve this issue?

More context about my dataset: it consists of two folders, one containing the images and the other containing the corresponding masks.

DhruvAwasthi commented 1 year ago

Try creating the dataset with the code below:

import os

from datasets import Dataset, Image

dataset_dir = "path_to_dataset_dir"

# assuming the dataset dir contains two subdirectories - "images" containing images, and "masks" containing masks
image_paths = [os.path.join(dataset_dir, "images", image) for image in os.listdir(os.path.join(dataset_dir, "images"))]
label_paths = [os.path.join(dataset_dir, "masks", label) for label in os.listdir(os.path.join(dataset_dir, "masks"))]

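def create_dataset(image_paths, label_paths):
    # Sort both lists so that each image lines up with its mask
    # (this relies on the two folders sorting in the same order).
    dataset = Dataset.from_dict({"image": sorted(image_paths),
                                 "label": sorted(label_paths)})
    # Cast the path strings to Image features so they are decoded as images on access.
    dataset = dataset.cast_column("image", Image())
    dataset = dataset.cast_column("label", Image())

    return dataset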

dataset = create_dataset(image_paths, label_paths)
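
Since the code above pairs images and masks purely by sort order, it may be worth checking that the two folders really line up before building the dataset. The following is only a sketch: the helper name pair_images_with_masks is hypothetical, and it assumes each mask has the same file name (ignoring the extension) as its image, which the thread does not confirm.

```python
from pathlib import Path


def pair_images_with_masks(image_dir, mask_dir):
    """Return sorted (image_path, mask_path) pairs matched by file name stem.

    Hypothetical helper: assumes a mask shares its image's name, ignoring
    the extension (e.g. images/001.jpg <-> masks/001.png).
    """
    images = {p.stem: p for p in Path(image_dir).iterdir() if p.is_file()}
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}

    # Any stem present in only one of the two folders would shift the pairing.
    unmatched = sorted(set(images) ^ set(masks))
    if unmatched:
        raise ValueError(f"Files without a counterpart: {unmatched}")

    return [(str(images[stem]), str(masks[stem])) for stem in sorted(images)]
```

If the check passes, the pairs can be split back into two lists, for example with image_paths, label_paths = map(list, zip(*pairs)), and passed to create_dataset as before.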