jbkinney / mavenn

MAVE-NN: genotype-phenotype maps from multiplex assays of variant effect
MIT License

`tutorial_0_built-in_examples` training set #11

Closed mahdikooshkbaghi closed 4 years ago

mahdikooshkbaghi commented 4 years ago

Hi, with mavenn 0.22, the following cell in the tutorial raises KeyError: 'training_set'.

# indices of training examples
i_training = mpsa_df['training_set']

# get test examples.
mpsa_test_df = mpsa_df[~i_training]

print('Test mpsa values:')
mpsa_test_df.head()

The call mpsa_df = mavenn.load_example_dataset(name='mpsa') returns a pandas DataFrame with a 'set' column instead of 'training_set'. Its first rows are:

set         tot_ct  ex_ct  y          x
training    28      2      -3.273018  GGAGUGAUG
training    315     7      -5.303781  AGUGUGCAA
test        193     15     -3.599913  UUCGCGCCA
validation  27      0      -4.807355  UAAGCUUUU
training    130     2      -5.448461  AUGGUCGGG

The solution was to select the test rows by the 'set' column instead:

mpsa_test_df = mpsa_df[mpsa_df['set']=='test']
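
For anyone who needs the notebook to work with both column layouts, here is a minimal sketch of a split that tolerates either one. It does not call mavenn; the DataFrame is rebuilt by hand from the five rows shown above, and split_by_set is a hypothetical helper name, not part of the mavenn API. It assumes the older layout stored 'training_set' as a boolean column, as the original tutorial cell implies.

```python
import pandas as pd

# Rows copied from the table in this issue; the real dataset is much larger.
mpsa_df = pd.DataFrame({
    'set': ['training', 'training', 'test', 'validation', 'training'],
    'tot_ct': [28, 315, 193, 27, 130],
    'ex_ct': [2, 7, 15, 0, 2],
    'y': [-3.273018, -5.303781, -3.599913, -4.807355, -5.448461],
    'x': ['GGAGUGAUG', 'AGUGUGCAA', 'UUCGCGCCA', 'UAAGCUUUU', 'AUGGUCGGG'],
})


def split_by_set(df):
    """Hypothetical helper: return (non_test_df, test_df) for either layout."""
    if 'training_set' in df.columns:
        # Assumed older layout: boolean column, True for training rows.
        is_test = ~df['training_set'].astype(bool)
    elif 'set' in df.columns:
        # Layout reported in this issue: string labels per row.
        is_test = df['set'] == 'test'
    else:
        raise KeyError("expected a 'training_set' or 'set' column")
    return df[~is_test], df[is_test]


mpsa_nontest_df, mpsa_test_df = split_by_set(mpsa_df)

print('Test mpsa values:')
print(mpsa_test_df.head())
```

Note that with the 'set' layout the non-test frame contains both the training and the validation rows; filter on mpsa_df['set'] == 'training' if only training rows are wanted.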

atareen commented 4 years ago

Hi Mahdi, this is set to be updated in the next release. Thanks for checking.

atareen commented 4 years ago

Hi Mahdi, I've updated this tutorial as part of the new release. Can you try again?

mahdikooshkbaghi commented 4 years ago

Hi Ammar, I can confirm that the notebook example runs from beginning to end without any issues with mavenn 0.23. Thanks.