snap-stanford / GEARS

GEARS is a geometric deep learning model that predicts outcomes of novel multi-gene perturbations
MIT License

simulation_single option with pert_data.prepare_split gives ValueError #72

Open murthy1770 opened 3 months ago

murthy1770 commented 3 months ago

pert_data.prepare_split(split = 'simulation_single', seed=1) # get data split with seed
pert_data.get_dataloader(batch_size = 32, test_batch_size = 128) # prepare data loader

This raises a ValueError (full traceback below). I am using a custom dataset that contains only single-gene perturbations (CROP-Seq). What is the difference between the simulation and simulation_single splits? Which one should be used for a dataset with only single-gene perturbations?


ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 pert_data.prepare_split(split = 'simulation_single', seed=1) # get data split with seed
      2 pert_data.get_dataloader(batch_size = 32, test_batch_size = 128) # prepare data loader

File ~/.conda/envs/biomodels/lib/python3.12/site-packages/gears/pertdata.py:355, in PertData.prepare_split(self, split, seed, train_gene_set_size, combo_seen2_train_frac, combo_single_split_test_set_fraction, test_perts, only_test_set_perts, test_pert_genes, split_dict_path)
    351 if split in ['simulation', 'simulation_single']:
    352     # simulation split
    353     DS = DataSplitter(self.adata, split_type=split)
--> 355     adata, subgroup = DS.split_data(train_gene_set_size = train_gene_set_size,
    356                                     combo_seen2_train_frac = combo_seen2_train_frac,
    357                                     seed=seed,
    358                                     test_perts = test_perts,
    359                                     only_test_set_perts = only_test_set_perts
    360                                    )
    361     subgroup_path = split_path[:-4] + '_subgroup.pkl'
    362     pickle.dump(subgroup, open(subgroup_path, "wb"))

ValueError: too many values to unpack (expected 2)
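
For context on the message itself: Python raises this error when the right-hand side of a two-target assignment such as `adata, subgroup = ...` yields more than two items. The sketch below reproduces that behavior in isolation. Both functions are hypothetical stand-ins, not GEARS code, and the sketch makes no claim about what `DS.split_data` actually returns for `simulation_single`; it only illustrates the kind of mismatch the traceback points to at line 355.

```python
def split_returning_pair():
    # Hypothetical stand-in: returns exactly two values, so a
    # two-target unpacking assignment succeeds.
    return "adata", {"test_subgroup": {}, "val_subgroup": {}}


def split_returning_single():
    # Hypothetical stand-in: returns ONE iterable object that holds
    # three items, so a two-target unpacking assignment fails.
    return ["cell_1", "cell_2", "cell_3"]


# Two values on the right, two names on the left: works.
adata, subgroup = split_returning_pair()

# Three items on the right, two names on the left: ValueError.
try:
    adata, subgroup = split_returning_single()
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

If something similar is happening here, inspecting the return value without unpacking it (for example, `result = DS.split_data(...)` followed by `type(result)`) would show whether the call returns a single object rather than a pair for this split type.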

domcke commented 1 month ago

I have the same issue.