Open ztybigcat opened 6 months ago
Feb 20 Update: migrated the workflow from my local PC to the Cerence cluster. The cluster runs CUDA 11, so I had to reconfigure the environment to make it work. Luckily the cluster has many GPUs available, so the runs were fast. Ran the experiments based on Thursday's discussion, and the results did not improve much. Observed interesting changes in the CE and Euclidean-distance losses in the TensorBoard plots: the Euclidean distance (Ed) loss does not decrease in the first few epochs, while the CE loss goes up after a few epochs and drives the total loss up. Expecting to discuss this on Thursday.
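For reference, the objective being tracked is the cross-entropy term plus the Euclidean reconstruction term. Below is a minimal, framework-agnostic NumPy sketch of that combined loss; the function names, the weighting factor `alpha`, and the variable names are hypothetical illustrations, not the actual training code:

```python
import numpy as np

def cross_entropy(logits, labels):
    # Standard softmax cross-entropy, averaged over the batch
    # (numerically stabilized by subtracting the row max).
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def euclidean_loss(embeddings, reconstructed):
    # Mean L2 distance between the encoder embeddings and the
    # autoencoder's reconstructions (the "Ed" loss in the plots).
    return np.linalg.norm(embeddings - reconstructed, axis=1).mean()

def total_loss(logits, labels, embeddings, reconstructed, alpha=1.0):
    # Total objective = CE + alpha * Ed. If CE rises after a few epochs
    # while Ed stays flat, the total loss is driven up by the CE term.
    ce = cross_entropy(logits, labels)
    ed = euclidean_loss(embeddings, reconstructed)
    return ce + alpha * ed, ce, ed
```

Logging `ce` and `ed` separately, as above, is what makes the observed pattern visible: the two components can move in opposite directions while the total loss only shows their sum.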
Previously: We attached an autoencoder to the transformer model and fine-tuned it on an in-domain dataset. The idea is that the autoencoder will learn to reconstruct in-domain (ID) prompts but not out-of-domain (OOD) ones. The issue: the autoencoder maps all the in-domain data into a tiny region of the embedding space, which results in nearly identical embeddings for all in-domain prompts.
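To make the failure mode concrete, here is a small sketch of the reconstruction-error OOD score this setup relies on, plus a variance diagnostic that flags the collapse. All names here are hypothetical illustrations of the idea, not the project's code:

```python
import numpy as np

def ood_scores(embeddings, reconstructed):
    # Per-prompt reconstruction error. The premise: an autoencoder
    # fine-tuned on ID data reconstructs ID embeddings well (low score)
    # and OOD embeddings poorly (high score).
    return np.linalg.norm(embeddings - reconstructed, axis=1)

def embedding_spread(codes):
    # Collapse diagnostic: if the autoencoder squeezes all ID prompts
    # into a tiny region, the per-dimension variance of its codes
    # (or reconstructions) is near zero, so ID prompts become
    # indistinguishable from one another.
    return codes.var(axis=0).mean()
```

A spread value near zero on the ID set is a cheap way to confirm the collapse described above before looking at individual prompt embeddings.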