JanTeichertKluge / DMLSim

This library provides packages on DoubleML / Causal Machine Learning and Neural Networks in Python for Simulation and Case Studies.

dataset #3

Open Yihuahai opened 4 days ago

Yihuahai commented 4 days ago

What format should this dataset have, and are there any example datasets I can refer to?

JanTeichertKluge commented 3 days ago

Hi Yihuahai, thanks for your question! This repository provides a collection of classes and methods I used for my master's thesis. The methods in the dml_sim submodule expect a data generating process as a callable. The dml_emb submodule provides methods that generate low-dimensional embeddings from image and text inputs and expects a pandas DataFrame. Please refer to the docstrings in the class headers, e.g. `dataset (pandas.DataFrame): the input dataset.` These embeddings can be used as confounders in several settings supported by the DoubleML package: https://docs.doubleml.org/ For more information on double machine learning with multimodal confounders, please refer to our latest paper: https://arxiv.org/abs/2402.01785
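To make the two input types a bit more concrete, here is a minimal sketch of what "a data generating process as a callable" and "a pandas DataFrame" could look like. The function names `example_dgp` and `to_frame`, the return order `(x, y, d)`, and the column names are assumptions made for illustration only; they are not taken from the DMLSim code, so please check the class docstrings for the signature and columns that are actually expected.

```python
import numpy as np
import pandas as pd


def example_dgp(n_obs=500, dim_x=5, theta=0.5, seed=None):
    """Hypothetical partially linear DGP (illustration, not part of DMLSim).

    Y = theta * D + g(X) + eps,   D = m(X) + v
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_obs, dim_x))            # confounders
    d = 0.5 * x[:, 0] + rng.normal(size=n_obs)     # treatment depends on X
    y = theta * d + np.sin(x[:, 1]) + rng.normal(size=n_obs)  # outcome
    return x, y, d


def to_frame(x, y, d):
    """Collect the simulated arrays in one DataFrame (assumed column names)."""
    df = pd.DataFrame(x, columns=[f"X{i + 1}" for i in range(x.shape[1])])
    df["y"] = y
    df["d"] = d
    return df


# The callable can be drawn from repeatedly, e.g. once per simulation run.
x, y, d = example_dgp(n_obs=200, seed=42)
df = to_frame(x, y, d)
print(df.head())
```

A frame of this shape could then be wrapped with DoubleML's `DoubleMLData(df, y_col="y", d_cols="d")` for estimation. For dml_emb, the DataFrame would additionally need the image and text inputs in whatever columns the docstrings specify.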

Thanks and have a nice day! BR Jan