stanfordnlp / pyvene

Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
http://pyvene.ai
Apache License 2.0
609 stars, 59 forks

[P1] Streamlining trainable intervention artifacts saving and sharing #30

Closed (frankaging closed this issue 9 months ago)

frankaging commented 9 months ago

Description: After training, an intervention's artifacts exist only in memory. There is no good way to save them to disk together with their metadata, or to share them on the Hugging Face Hub. This change provides a smooth way to save and share interventions trained by users.

The key task is serializing the metadata into a shareable format; both serialization and deserialization need to be tested. The receiving party will still need to know how the counterfactual dataset was generated, but that is less a problem for this library than a matter of sharing the dataset itself. Dataset sharing can be a separate process that is not part of this library.

This change should also cover sharing interventions that contain a vector store (for example, a truthful direction meant for sharing).
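To make the round-trip requirement concrete, here is a minimal sketch of saving an intervention's metadata together with a stored vector and loading it back. It is not pyvene's API: `InterventionArtifact`, `save_artifact`, `load_artifact`, the field names, and the `intervention.json` file name are all hypothetical, and JSON is just one possible shareable format.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class InterventionArtifact:
    """Hypothetical container for illustration; not a pyvene class."""
    intervention_type: str  # illustrative label, e.g. "vector_addition"
    layer: int              # which layer the intervention targets
    component: str          # illustrative component name
    # Learned parameters or a fixed direction, stored as plain floats.
    vector: List[float] = field(default_factory=list)


def save_artifact(artifact: InterventionArtifact, directory) -> Path:
    """Write metadata and vector to <directory>/intervention.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "intervention.json"
    path.write_text(json.dumps(asdict(artifact), indent=2))
    return path


def load_artifact(directory) -> InterventionArtifact:
    """Rebuild the artifact from the file written by save_artifact."""
    path = Path(directory) / "intervention.json"
    return InterventionArtifact(**json.loads(path.read_text()))


# Round trip: what is loaded must equal what was saved.
original = InterventionArtifact(
    intervention_type="vector_addition",
    layer=4,
    component="block_output",
    vector=[0.25, -0.5, 1.0],
)
with tempfile.TemporaryDirectory() as tmp:
    save_artifact(original, tmp)
    restored = load_artifact(tmp)

assert restored == original
```

A test of this shape exercises serialization and deserialization together, so a field that is written but never read back (or the reverse) shows up as a failed equality check. Real trained weights would more likely be stored in a tensor format next to the metadata file rather than inline as JSON numbers.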

Testing Done:


Ran 25 tests in 4.280s

OK


- Added a new tutorial: `tutorials/basic_tutorials/Load_Save_and_Share_Interventions.ipynb`.