For now, I suggest using Turing as the main machine for benchmarking.
Constraint: 8 hours, 16 CPUs (Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz)
I will share the scripts for setting up this constraint later.
This can be put on hold until we finish data collection.
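In the meantime, here is a minimal sketch of one way to enforce the time and CPU limits by wrapping the training command with GNU coreutils' `timeout` and `taskset`. The command (`python train.py`) is a placeholder, not the actual benchmark entry point, and this is not the final constraint script mentioned above.

```python
# Sketch: pin a benchmark run to 16 cores and kill it after 8 hours
# using GNU `timeout` and `taskset` (assumed available on Turing).
import subprocess

def constrained_cmd(cmd, hours=8, n_cpus=16):
    """Prefix `cmd` so it is killed after `hours` and pinned to `n_cpus` cores."""
    cpu_list = "0-%d" % (n_cpus - 1)  # e.g. "0-15" for 16 CPUs
    return ["timeout", "%dh" % hours, "taskset", "-c", cpu_list] + list(cmd)

# Example (placeholder command):
# subprocess.run(constrained_cmd(["python", "train.py"]))
```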
Eval Metrics
Following the AMLB paper,
we use the area under the receiver operating characteristic curve (AUC) for binary classification, log loss for multi-class classification, and root mean squared error (RMSE) for regression to evaluate model performance.
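For reference, the three metrics can be computed as follows with scikit-learn (assumed available in the benchmarking environment; the inputs below are toy examples, not benchmark data):

```python
# Toy examples of the three AMLB evaluation metrics.
import numpy as np
from sklearn.metrics import roc_auc_score, log_loss, mean_squared_error

# Binary classification: AUC over predicted positive-class probabilities.
y_bin, p_bin = [0, 1, 1, 0], [0.2, 0.8, 0.6, 0.3]
auc = roc_auc_score(y_bin, p_bin)

# Multi-class classification: log loss over predicted class probabilities.
y_mc = [0, 1, 2]
p_mc = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]]
ll = log_loss(y_mc, p_mc, labels=[0, 1, 2])

# Regression: RMSE between targets and predictions.
y_reg, y_hat = [1.0, 2.0, 3.0], [1.1, 1.9, 3.2]
rmse = np.sqrt(mean_squared_error(y_reg, y_hat))
```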
Therefore, we will report the following metrics for binary classification, multi-class classification, and regression, respectively: AUC, log loss, and RMSE.
Per-modality baselines can be configured via AutoMM's model.names hyperparameter: https://auto.gluon.ai/dev/tutorials/multimodal/customization.html#model-names
# default used by AutoMM
predictor.fit(hyperparameters={"model.names": ["hf_text", "timm_image", "clip", "categorical_mlp", "numerical_mlp", "fusion_mlp"]})
# use only text models
predictor.fit(hyperparameters={"model.names": ["hf_text"]})
# use only image models
predictor.fit(hyperparameters={"model.names": ["timm_image"]})
# use only clip models
predictor.fit(hyperparameters={"model.names": ["clip"]})
GNN-based Methods
TabGNN (graph constructed from features and heuristics):
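One common heuristic for building such a graph from a table (a sketch of the general idea, not TabGNN's exact construction): connect two rows with an edge whenever they share a value in a chosen categorical column.

```python
# Sketch: build an undirected edge list over table rows that share a
# value in one categorical column (a simple heuristic graph construction).
from collections import defaultdict
from itertools import combinations

def edges_from_shared_value(rows, col):
    """Return edges (i, j) between row indices with equal rows[i][col]."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[row[col]].append(i)
    edges = []
    for members in groups.values():
        edges.extend(combinations(members, 2))
    return edges

rows = [{"city": "NY"}, {"city": "LA"}, {"city": "NY"}]
# Rows 0 and 2 share city "NY", so they get an edge.
```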
This issue is for recording the experiment results and the experiment constraints.
Raw CSV results will be saved to https://emory-my.sharepoint.com/personal/jlu229_emory_edu/_layouts/15/onedrive.aspx?ga=1&id=%2Fpersonal%2Fjlu229%5Femory%5Fedu%2FDocuments%2FGraph%20Mining%20Lab%2FFA22%5FMUG%5FMMAutoML%2FexpResults
Eventually, all results will be collected into https://www.overleaf.com/project/632df1953d99ba1634e61f15
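To keep the raw CSV files uniform before uploading them to the folder above, each run could append one row to a shared results file. The column names below are assumptions, not an agreed schema:

```python
# Sketch: append one benchmark result row to a CSV file, writing the
# header only when the file is first created (column names are assumed).
import csv
import os

def append_result(path, row, fieldnames=("dataset", "method", "metric", "score")):
    """Append `row` (a dict keyed by `fieldnames`) to the CSV at `path`."""
    is_new = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerow(row)

# Example:
# append_result("expResults.csv",
#               {"dataset": "petfinder", "method": "AutoMM",
#                "metric": "auc", "score": 0.91})
```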