I have a question about NPCA!

ORippler / gaussian-ad-mvtec

Code underlying our publication "Modeling the Distribution of Normal Data in Pre-Trained Deep Features for Anomaly Detection" at ICPR2020

GNU Affero General Public License v3.0


I have a question about NPCA! #3

Closed PeterKim1 closed 3 years ago

PeterKim1 commented 3 years ago

Hello.

Thank you for sharing your research code!

I have a question about the NPCA code.

If I want to run NPCA 1%, should I set the 'variance_threshold' parameter to 0.99?

The 'variance_threshold' parameter is here: https://github.com/ORippler/gaussian-ad-mvtec/blob/main/src/gaussian/model.py#L208

ORippler commented 3 years ago

Hi,

You need to specify --npca and set the variance_threshold parameter to 0.99; see here:

https://github.com/ORippler/gaussian-ad-mvtec/blob/main/src/scripts/table4.py#L23-L24

Best,
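
For readers following along, here is a minimal sketch of how a variance threshold of 0.99 can split principal components into a high-variance group and the low-variance remainder (the last ~1% of variance, which is how "NPCA 1%" is understood here). This is not the repository's code: the synthetic `features` array, the variable names `n_high`, `high_variance` and `low_variance`, and the exact slicing boundary are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for deep features (assumption): 209 samples, 50 dimensions,
# with per-dimension scale decaying from 5.0 to 0.1 so the variance is uneven.
scales = np.linspace(5.0, 0.1, 50)
features = rng.normal(size=(209, 50)) * scales

variance_threshold = 0.99  # same name as the parameter discussed above

# Fit PCA with all components so the full variance spectrum is available.
pca = PCA(n_components=None, svd_solver="full").fit(features)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of leading components whose cumulative ratio reaches the threshold.
n_high = int(np.searchsorted(cumulative, variance_threshold)) + 1

# Leading components explain >= 99% of the variance; the remainder explains <= 1%.
high_variance = pca.components_[:n_high]
low_variance = pca.components_[n_high:]  # boundary choice is an assumption

print(high_variance.shape, low_variance.shape)
```

The repository may place the boundary component differently (the slice quoted later in this thread starts at `last_components-1`), so treat the split index above as illustrative only.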

PeterKim1 commented 3 years ago

@ORippler Thank you for your very fast response!

I have one more question about this part: https://github.com/ORippler/gaussian-ad-mvtec/blob/main/src/gaussian/model.py#L219:L236

For example, suppose I have 209 samples and each sample is a 2096-dimensional vector (the same as Fig. 1 in your paper),

so the feature matrix has shape (209, 2096).

Then I run PCA.

Following your code, I printed pca.components_[last_components-1:], and it has shape (7, 2096).

I expected pca.components_[last_components-1:] to have shape (209, 7).

Is there an error here?

ORippler commented 3 years ago

I think you have a misunderstanding regarding pca.components_, which stores the matrix used for reducing input features to principal components (e.g., by performing the correct matmul you can reduce your original 2096-dimensional space to just 7, 204 or 209 dimensions). In our code, this is done here.
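
To make the shapes concrete, here is a small sketch using scikit-learn's `PCA` on random data with the sizes from the question. In scikit-learn, `components_` has shape `(n_components, n_features)`, so a slice of 7 components is `(7, 2096)`; the `(209, 7)` matrix only appears after projecting the centered features onto those components. The random `features` array and the choice of the last 7 rows are assumptions for illustration, not the repository's exact logic.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Random stand-in for the feature matrix in the question: 209 samples, 2096 features.
n_samples, n_features = 209, 2096
features = rng.normal(size=(n_samples, n_features))

# With all components kept, n_components = min(n_samples, n_features) = 209.
pca = PCA(n_components=None, svd_solver="full").fit(features)
print(pca.components_.shape)  # (n_components, n_features)

# Selecting 7 components yields a projection matrix, not reduced data.
selected = pca.components_[-7:]
print(selected.shape)  # (7, n_features)

# The reduced representation comes from the matmul with the centered features.
centered = features - pca.mean_
reduced = centered @ selected.T
print(reduced.shape)  # (n_samples, 7)

# Sanity check: this matches the corresponding columns of pca.transform.
matches_transform = np.allclose(reduced, pca.transform(features)[:, -7:])
print(matches_transform)
```

Each row of `components_` is one principal axis expressed in the original feature space, which is why its second dimension equals the number of input features rather than the number of samples.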

PeterKim1 commented 3 years ago

Wow. Thanks for your kind explanation.

I know PCA in theory, but I have no experience using PCA in sklearn, so I misunderstood pca.components_.

I fully understand your code now. Thank you!