Open ztybigcat opened 6 months ago
Feb 20 Update: migrated the workflow from my local PC to the Cerence cluster. The cluster runs CUDA 11, so I had to reconfigure the environment to make it work. Luckily the cluster has many GPUs available, so the runs were fast. Ran the experiments based on Thursday's discussion, and the results did not improve much. Observed interesting changes in the CE and Euclidean-distance losses in the TensorBoard plots: the Euclidean distance (Ed) loss does not decrease in the first few epochs, while the CE loss goes up after a few epochs and drives the total loss up. Expecting to discuss this on Thursday.
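For reference, the objective being tracked is the cross-entropy term plus the Euclidean reconstruction term. Below is a minimal, framework-agnostic NumPy sketch of that combined loss; the function names, the weighting factor `alpha`, and the variable names are hypothetical illustrations, not the actual training code:

```python
import numpy as np

def cross_entropy(logits, labels):
    # Standard softmax cross-entropy, averaged over the batch
    # (numerically stabilized by subtracting the row max).
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def euclidean_loss(embeddings, reconstructed):
    # Mean L2 distance between the encoder embeddings and the
    # autoencoder's reconstructions (the "Ed" loss in the plots).
    return np.linalg.norm(embeddings - reconstructed, axis=1).mean()

def total_loss(logits, labels, embeddings, reconstructed, alpha=1.0):
    # Total objective = CE + alpha * Ed. If CE rises after a few epochs
    # while Ed stays flat, the total loss is driven up by the CE term.
    ce = cross_entropy(logits, labels)
    ed = euclidean_loss(embeddings, reconstructed)
    return ce + alpha * ed, ce, ed
```

Logging `ce` and `ed` separately, as above, is what makes the observed pattern visible: the two components can move in opposite directions while the total loss only shows their sum.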
Previously: We attached an autoencoder to the transformer model and fine-tuned it on an in-domain dataset. The idea is that the autoencoder will learn to reconstruct in-domain (ID) prompts but not out-of-domain (OOD) ones. The issue: the autoencoder maps all the in-domain data into a tiny region of the embedding space, which results in nearly identical embeddings for all in-domain prompts.
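To make the failure mode concrete, here is a small sketch of the reconstruction-error OOD score this setup relies on, plus a variance diagnostic that flags the collapse. All names here are hypothetical illustrations of the idea, not the project's code:

```python
import numpy as np

def ood_scores(embeddings, reconstructed):
    # Per-prompt reconstruction error. The premise: an autoencoder
    # fine-tuned on ID data reconstructs ID embeddings well (low score)
    # and OOD embeddings poorly (high score).
    return np.linalg.norm(embeddings - reconstructed, axis=1)

def embedding_spread(codes):
    # Collapse diagnostic: if the autoencoder squeezes all ID prompts
    # into a tiny region, the per-dimension variance of its codes
    # (or reconstructions) is near zero, so ID prompts become
    # indistinguishable from one another.
    return codes.var(axis=0).mean()
```

A spread value near zero on the ID set is a cheap way to confirm the collapse described above before looking at individual prompt embeddings.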