Open PeterKim1 opened 2 years ago
Hello.
I want to apply this model to a large image dataset (over 10,000 images), but I run into a RAM issue.
https://github.com/hcw-00/PatchCore_anomaly_detection/blob/main/sampling_methods/kcenter_greedy.py#L95
self.features = model.transform(self.X)
I think this line loads all of the data embeddings into RAM and applies the SparseRandomProjector to them all at once, which seems to put a lot of pressure on memory. (I'm a novice, so I may be wrong about this.)
Does anyone know how to solve this problem?
One idea I have is to split the data in half and apply the SparseRandomProjector to each half separately, but I think that might cause problems, because the SparseRandomProjector determines the dimensionality of the embeddings based on the Johnson-Lindenstrauss lemma.
According to the sklearn documentation (https://scikit-learn.org/stable/modules/generated/sklearn.random_projection.SparseRandomProjection.html), n_components can be adjusted automatically according to the number of samples in the dataset.
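For what it's worth, here is a sketch of one workaround (not tested against this repo; the array sizes and the chunk size of 500 are arbitrary placeholders): fit the SparseRandomProjection once, so that n_components is chosen from the total sample count via the Johnson-Lindenstrauss bound, and then call transform chunk by chunk. Since the fitted projection is a fixed linear map applied row by row, concatenating the chunk results gives the same output as projecting everything at once, while keeping less data live at a time. (As far as I can tell, sklearn's fit only uses X.shape to pick n_components, so the expensive part to chunk is the transform.)

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection

rng = np.random.default_rng(0)
# Stand-in for the patch embeddings; in the real setting this would be
# the (very large) matrix that model.transform(self.X) receives.
X = rng.standard_normal((2000, 512)).astype(np.float32)

# Fit once: n_components is derived from the TOTAL number of samples
# via the Johnson-Lindenstrauss lemma, so every chunk is projected
# with the same matrix to the same dimensionality.
# eps=0.9 is a deliberately loose distortion bound so that the demo's
# n_components stays below 512 features.
proj = SparseRandomProjection(
    n_components="auto", eps=0.9, dense_output=True, random_state=0
)
proj.fit(X)

# Transform in chunks: the projection is row-wise independent, so the
# stacked chunk outputs equal the all-at-once result.
chunk = 500
parts = [proj.transform(X[i:i + chunk]) for i in range(0, len(X), chunk)]
Y = np.vstack(parts)

assert np.allclose(Y, proj.transform(X))
```

The chunked loop could also write each projected block to disk (e.g. np.save) instead of keeping the list in memory, which would bring peak RAM down to roughly one chunk of embeddings at a time.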