snorkel-team / snorkel-extraction

A previous version of Snorkel focused on information extraction
Apache License 2.0

A little change about dataloader #4

Closed yinxiangshi closed 2 years ago

yinxiangshi commented 3 years ago

When I tried to run this tutorial, I found a bug, so I made a small change to the data loader. The `numpy.load` calls leave `allow_pickle` at false, which makes the data loader crash, so I changed every call to pass `allow_pickle=True`.

```python
import json
import os
import ssl

import numpy as np


class DataLoader(object):
    def __init__(self, data_path='/data/'):
        # fix SSL certificate issues when loading images via HTTPS
        ssl._create_default_https_context = ssl._create_unverified_context

        current_dir = os.getcwd()
        self.data_path = current_dir + data_path

        # allow_pickle=True: np.load refuses pickled object arrays otherwise
        def load_train_attr(self):
            self.train_mscoco = np.load(self.data_path + 'train_mscoco.npy', allow_pickle=True)
            self.train_vg = np.load(self.data_path + 'train_vg.npy', allow_pickle=True)
            self.train_vg_idx = np.load(self.data_path + 'train_vg_idx.npy', allow_pickle=True)
            self.train_ground = np.load(self.data_path + 'train_ground.npy', allow_pickle=True)

            self.train_object_names = np.load(self.data_path + 'train_object_names.npy', allow_pickle=True)
            self.train_object_x = np.load(self.data_path + 'train_object_x.npy', allow_pickle=True)
            self.train_object_y = np.load(self.data_path + 'train_object_y.npy', allow_pickle=True)
            self.train_object_height = np.load(self.data_path + 'train_object_height.npy', allow_pickle=True)
            self.train_object_width = np.load(self.data_path + 'train_object_width.npy', allow_pickle=True)

        def load_val_attr(self):
            self.val_mscoco = np.load(self.data_path + 'val_mscoco.npy', allow_pickle=True)
            self.val_vg = np.load(self.data_path + 'val_vg.npy', allow_pickle=True)
            self.val_vg_idx = np.load(self.data_path + 'val_vg_idx.npy', allow_pickle=True)
            self.val_ground = np.load(self.data_path + 'val_ground.npy', allow_pickle=True)

            self.val_object_names = np.load(self.data_path + 'val_object_names.npy', allow_pickle=True)
            self.val_object_x = np.load(self.data_path + 'val_object_x.npy', allow_pickle=True)
            self.val_object_y = np.load(self.data_path + 'val_object_y.npy', allow_pickle=True)
            self.val_object_height = np.load(self.data_path + 'val_object_height.npy', allow_pickle=True)
            self.val_object_width = np.load(self.data_path + 'val_object_width.npy', allow_pickle=True)

        # Load both splits and record how many examples each contains
        load_train_attr(self)
        self.train_num = np.shape(self.train_object_names)[0]
        load_val_attr(self)
        self.val_num = np.shape(self.val_object_names)[0]

        with open(self.data_path + 'image_data.json') as json_data:
            self.data = json.load(json_data)
```

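
For anyone hitting the same crash, here is a minimal sketch of the behaviour behind it. It assumes the tutorial's `.npy` files hold object arrays (for example ragged lists of object names); the file name `object_names.npy`, the sample data, and the helper `try_load` are made up for illustration and are not part of the tutorial.

```python
import os
import tempfile

import numpy as np


def try_load(path, allow_pickle):
    """Return the loaded array, or the error message if np.load refuses."""
    try:
        return np.load(path, allow_pickle=allow_pickle)
    except ValueError as err:
        return str(err)


# Hypothetical stand-in for a file such as train_object_names.npy:
# one list of object names per image, stored as an object array.
names = np.empty(2, dtype=object)
names[0] = ["person", "bike"]
names[1] = ["dog"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "object_names.npy")
    np.save(path, names)  # object arrays are pickled when saved

    blocked = try_load(path, allow_pickle=False)  # same as the np.load default
    loaded = try_load(path, allow_pickle=True)    # the change made in this issue

print(type(blocked).__name__)  # a str holding the ValueError message
print(loaded.shape)            # shape of the array that was read back
```

Note that `allow_pickle=True` unpickles whatever the file contains, so it should only be used on `.npy` files from a source you trust.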
paroma commented 3 years ago

The updated tutorials are in the snorkel-tutorials repository, with an image-based tutorial here!