braceal / molecules

Machine learning for molecular dynamics.
MIT License

Run tsne callbacks in a subprocess #49

Closed braceal closed 3 years ago

braceal commented 4 years ago

We could make a t-SNE plotting module.

Make impl functions for 2D and 3D that can be called as normal Python functions. Each takes a path to an h5 or npy file of embedding coordinates, saves the plot to save_path, and optionally writes it to TensorBoard, wandb, etc. Example: plot_tsne2d_impl(embeddings_path, save_path, wandb=None, tensorboard_writer=None, **kwargs)

Then make a click interface to the 2D and 3D t-SNE impls. '2d' vs '3d' can be a CLI parameter.

Now make another function that uses the subprocess module to call the click CLI (below):

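def plot_tsne(embeddings_path, save_path, plot_dim, subprocess=False, **kwargs):
    # Note: the `subprocess` flag shadows the subprocess module inside this
    # function, so the module would need to be imported under another name.
    if subprocess:
        # call click CLI with subprocess module
        pass
    else:
        if plot_dim == '2d':
            plot_tsne2d_impl(...)
        elif plot_dim == '3d':
            plot_tsne3d_impl(...)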

Note: both the click interface and the subprocess interface are very small functions that basically just pass arguments through.
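To illustrate how small the subprocess side could be, here is a minimal sketch using only the standard library. The module path `molecules.plot.tsne_cli` and the option names `--embeddings_path`, `--save_path`, and `--plot_dim` are assumptions made for illustration; the issue does not specify them, and the real click CLI may differ.

```python
import subprocess
import sys

# Hypothetical module path for the click CLI; the issue does not name one.
TSNE_CLI_MODULE = "molecules.plot.tsne_cli"


def build_tsne_command(embeddings_path, save_path, plot_dim):
    """Build the argv list that would invoke the (hypothetical) click CLI."""
    if plot_dim not in ("2d", "3d"):
        raise ValueError(f"plot_dim must be '2d' or '3d', got {plot_dim!r}")
    return [
        sys.executable, "-m", TSNE_CLI_MODULE,
        # Option names below are assumed, not taken from the repository.
        "--embeddings_path", str(embeddings_path),
        "--save_path", str(save_path),
        "--plot_dim", plot_dim,
    ]


def launch_tsne_subprocess(embeddings_path, save_path, plot_dim):
    """Start the plot in a child process and return without waiting for it."""
    cmd = build_tsne_command(embeddings_path, save_path, plot_dim)
    # Popen returns immediately, so the caller (e.g. training) is not blocked.
    return subprocess.Popen(cmd)
```

Keeping the command construction separate from the launch makes the argument passing easy to check without actually spawning a process.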

This makes the callbacks very simple. They only need to save the embeddings to disk as an npy file (or something similar) and then call the plot_tsne function with subprocess=True. In fact, this way we only need one t-SNE callback, and we can specify 2d, 3d, or both as an input parameter, so the embeddings file won't get saved twice.
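A single callback along these lines might look like the sketch below. The class name `TSNEPlotCallback`, the `on_epoch_end(epoch, embeddings)` hook, the file naming scheme, and the injectable `save_fn` are all assumptions made for illustration, not the repository's actual callback API. The `plot_fn` argument stands in for the plot_tsne function described above.

```python
from pathlib import Path


def _save_npy(path, embeddings):
    """Default saver; numpy is imported lazily so the sketch loads without it."""
    import numpy as np

    np.save(path, embeddings)


def expand_plot_dims(plot_dims):
    """Map '2d', '3d', or 'both' to the list of dimensions to plot."""
    mapping = {"2d": ["2d"], "3d": ["3d"], "both": ["2d", "3d"]}
    if plot_dims not in mapping:
        raise ValueError(f"plot_dims must be '2d', '3d' or 'both', got {plot_dims!r}")
    return mapping[plot_dims]


class TSNEPlotCallback:
    """Hypothetical callback: save embeddings once, then request each plot."""

    def __init__(self, out_dir, plot_fn, plot_dims="both", save_fn=_save_npy):
        self.out_dir = Path(out_dir)
        self.plot_fn = plot_fn  # e.g. the plot_tsne function from this issue
        self.dims = expand_plot_dims(plot_dims)
        self.save_fn = save_fn

    def on_epoch_end(self, epoch, embeddings):
        self.out_dir.mkdir(parents=True, exist_ok=True)
        embeddings_path = self.out_dir / f"embeddings-epoch-{epoch}.npy"
        # The embeddings are written a single time, even when plotting both dims.
        self.save_fn(embeddings_path, embeddings)
        for dim in self.dims:
            save_path = self.out_dir / f"tsne-{dim}-epoch-{epoch}.png"
            self.plot_fn(embeddings_path, save_path, dim, subprocess=True)
```

Passing `plot_dims="both"` triggers two plot calls against the same saved file, which is the point of merging the 2D and 3D callbacks.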

Putting 2D and 3D together is not the most general approach; for example, what if a future callback also needs the embeddings? There is probably a better solution for saving the embeddings for use with multiple callbacks.

braceal commented 3 years ago

Done.