zohreh-aaa / DNN-Testing

DNN
Dimension Reduction with umap #2

Closed omeryasar closed 3 months ago

omeryasar commented 3 months ago

Hi, can you explain why you reduce the features with two different UMAP configurations? And why do you add the prediction and ground truth after completing the dimensionality reduction?

    fit = umap.UMAP(min_dist=n_n, n_components=i, n_neighbors=k)
    u1 = fit.fit_transform(X_features)
    fit = umap.UMAP(min_dist=0.1, n_components=j, n_neighbors=o)
    u = fit.fit_transform(u1)
    u = np.c_[u, TY_scaled, PY_scaled]

I really couldn't understand the reason for reducing the feature dimensionality in two steps.

Thanks in advance for your help.

zohreh-aaa commented 3 months ago

Hi,

By tuning hyperparameters with Optuna and running various experiments, we found that this approach (two reductions) yielded the best clustering accuracy on our data. However, the two-step reduction is optional: you can also use a single-step reduction or other dimensionality reduction techniques.

Regarding the ground truth: as mentioned in our paper, we included two vectors to extract information from the model under test. The features are derived from VGG16 (a pretrained model), which is unrelated to our model under test, so we wanted to add some information from the model under test to assist the clustering. The two vectors, ground truth and predicted label, give the clustering some features that come from the model under test. We added them after the reduction so that they carry more weight, and also because they are not the same type as the other features, which all come from VGG16. For more details on the fault estimation part, please refer to our first paper: Blackbox Testing of DNN.
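As a rough illustration of the concatenation step only, here is a minimal sketch using NumPy alone. The array `u` is a random stand-in for the UMAP-reduced VGG16 features, the labels are made up, and `min_max_scale` is a hypothetical helper: this thread does not say how `TY_scaled` and `PY_scaled` were actually scaled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: `u` plays the role of the reduced VGG16 features,
# the two label arrays play the role of ground truth and model predictions.
u = rng.normal(size=(6, 3))
true_labels = np.array([0, 0, 1, 1, 2, 2])
pred_labels = np.array([0, 1, 1, 1, 2, 0])


def min_max_scale(v):
    """Scale a 1-D array to [0, 1] (an assumed scaling, not taken from the repo)."""
    v = v.astype(float)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)


ty_scaled = min_max_scale(true_labels)
py_scaled = min_max_scale(pred_labels)

# np.c_ appends the two 1-D label vectors as extra columns of the embedding.
features = np.c_[u, ty_scaled, py_scaled]

# Inputs 0 and 1 share a ground-truth label but differ in predicted label,
# so the appended columns increase their Euclidean distance.
d_before = np.linalg.norm(u[0] - u[1])
d_after = np.linalg.norm(features[0] - features[1])
```

Because the label columns are appended after the reduction, they make up two of the five final dimensions here rather than being diluted among the much larger set of raw VGG16 features, so they contribute directly to the distances a clustering algorithm sees.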

Thank you!