Saving Off Model Embeddings

byu-dml / d3m-dynamic-neural-architecture

Saving Off Model Embeddings #201

Open epeters3 opened 4 years ago

epeters3 commented 4 years ago

This is the feature request for saving off the embeddings of the metamodels. Here is a list of all the deep learning metamodels:

dna_regression
lstm
daglstm_regression
hidden_daglstm_regression
attention_regression
dag_attention_regression
mlp_regression

❔ What is a good way to perform the saving of the embeddings? One option would be to add a --save-embeddings flag to the dna evaluate CLI.
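For illustration, the flag option could look something like the sketch below. This is only a sketch under assumptions: the parser here is a hypothetical stand-in for the real `dna` CLI, and the `--model` argument and the `PATH` value for `--save-embeddings` are made up for the example, not taken from the existing code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real `dna` CLI; only the new flag matters here.
    parser = argparse.ArgumentParser(prog="dna")
    subparsers = parser.add_subparsers(dest="command", required=True)

    evaluate = subparsers.add_parser("evaluate", help="evaluate a metamodel")
    # Assumed argument, included only so the example parses a realistic command line.
    evaluate.add_argument("--model", required=True, help="metamodel name, e.g. dna_regression")
    evaluate.add_argument(
        "--save-embeddings",
        metavar="PATH",
        default=None,
        help="if given, write the metamodel's embeddings to PATH after evaluation",
    )
    return parser


args = build_parser().parse_args(
    ["evaluate", "--model", "dna_regression", "--save-embeddings", "embeddings.pt"]
)
print(args.save_embeddings)  # embeddings.pt
```

Defaulting the flag to `None` keeps the current behavior of `evaluate` unchanged unless a path is passed.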

epeters3 commented 4 years ago

What are your thoughts @bjschoenfeld?

bjschoenfeld commented 4 years ago

We want embeddings from the pipeline encoding portions of the deep learning models. I have not looked at the code here for a while, so I am not sure if adding a method to dna.models.base_models.PyTorchModelBase will take care of it all.

epeters3 commented 4 years ago

Just to clarify, we don't want embeddings that include information about the dataset or score? Just the pipelines?

bjschoenfeld commented 4 years ago

Yes, let's start with pipeline embeddings that don't include dataset information. I am not sure how we would exclude the score information, since the score is the only signal used to train the networks. We could come up with an unsupervised method, but things are not set up for that yet.

epeters3 commented 4 years ago

In our team meeting last week, we decided that a good approach would be a separate CLI that loads a model's weights, then creates embeddings for a dataset you point it to, using the model initialized with those weights. It should compute embeddings from only the pipeline portion of that dataset.
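A rough sketch of the core of that flow in PyTorch, under stated assumptions: `ToyMetamodel`, `encode_pipeline`, and `embed_pipelines` are hypothetical names invented for this example and do not correspond to anything in `dna.models`. The real metamodels encode pipelines differently; the point is only that the embedding comes from the pipeline encoder alone and never touches the dataset features or the score head.

```python
import os
import tempfile

import torch
from torch import nn


class ToyMetamodel(nn.Module):
    """Hypothetical stand-in for a metamodel: a pipeline encoder plus a score head."""

    def __init__(self, n_primitives: int = 10, embed_dim: int = 8, dataset_dim: int = 4):
        super().__init__()
        self.primitive_embedding = nn.Embedding(n_primitives, embed_dim)
        self.pipeline_encoder = nn.LSTM(embed_dim, embed_dim, batch_first=True)
        self.score_head = nn.Linear(embed_dim + dataset_dim, 1)

    def encode_pipeline(self, primitive_ids: torch.Tensor) -> torch.Tensor:
        # Uses only the pipeline's primitives; no dataset features are involved.
        embedded = self.primitive_embedding(primitive_ids)
        _, (hidden, _) = self.pipeline_encoder(embedded)
        return hidden[-1]  # shape: (batch, embed_dim)

    def forward(self, primitive_ids: torch.Tensor, dataset_features: torch.Tensor) -> torch.Tensor:
        # Training path: the score is predicted from pipeline and dataset together.
        pipeline_embedding = self.encode_pipeline(primitive_ids)
        return self.score_head(torch.cat([pipeline_embedding, dataset_features], dim=1))


def embed_pipelines(weights_path: str, primitive_ids: torch.Tensor) -> torch.Tensor:
    # Load saved weights into a fresh model, then run the pipeline encoder only.
    model = ToyMetamodel()
    model.load_state_dict(torch.load(weights_path))
    model.eval()
    with torch.no_grad():
        return model.encode_pipeline(primitive_ids)


# Demo with untrained weights: save a state dict, then embed two pipelines
# of three primitives each and write the embeddings next to the weights.
work_dir = tempfile.mkdtemp()
weights_path = os.path.join(work_dir, "weights.pt")
torch.save(ToyMetamodel().state_dict(), weights_path)

embeddings = embed_pipelines(weights_path, torch.tensor([[1, 2, 3], [4, 5, 6]]))
torch.save(embeddings, os.path.join(work_dir, "pipeline_embeddings.pt"))
print(embeddings.shape)  # torch.Size([2, 8])
```

Because the encoder weights were trained through the score head, the embeddings still carry score information indirectly, which matches the caveat raised earlier in this thread.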