Closed x12901 closed 1 month ago
Hi x12901, thanks for your interest!
Off the bat, 8000 images is a lot and way more than I've benchmarked. Here is a short overview of where the bulk of the memory goes for each method:
In general: the more feature maps a backbone returns, the heavier the computation and memory use will be.

SPADE:
- `self.zlib` <-- stack of feature vectors, float32.
- `self.fmaps` <-- list of stacks of feature maps, float32. This becomes the bulk of the memory.
- Solution: use something like mmap (memory-mapped file support) to build it outside RAM; faiss could also be used. You will likely sacrifice some inference speed. (First sketch below.)

PaDiM:
- `self.patch_lib` <-- stack of 2D patches, float32. This becomes the main bulk of the memory.
- `torch.linalg.inv(self.E)` <-- this can cause memory issues if your 2D grid is large.
- Solution: compute the mean and covariance matrix online while the samples are added to the training set. (Second sketch below.)

PatchCore:
- `self.patch_lib` <-- collection of patches, float32. This becomes the main bulk of the memory.
- Coreset selection <-- this can also eat quite some memory, since you need to calculate distances between vectors.
- Solution: the authors of the paper use faiss in their implementation, likely because it solves a couple of memory issues as well. (Third sketch below.)
In a nutshell: every method keeps a float32 library of features that grows with the number of training images, and with 8000 images that library is what blows up your RAM.
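First sketch (for the SPADE point): a minimal illustration of the mmap idea, with assumed names and sizes rather than the repo's actual code. The feature stack lives in a file on disk, and only the slices you touch are paged into RAM.

```python
import numpy as np
import torch

# Assumed sizes, purely for illustration.
n_train, feat_dim = 8000, 512

# float32 stack of feature vectors backed by a file instead of RAM.
z_lib = np.memmap("z_lib.dat", dtype=np.float32, mode="w+", shape=(n_train, feat_dim))

for i in range(n_train):
    z = torch.randn(feat_dim)   # stand-in for a backbone feature vector
    z_lib[i] = z.numpy()        # written through to disk
z_lib.flush()

# At inference time, only the rows you actually index get loaded.
row = torch.from_numpy(np.array(z_lib[0]))
```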
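Second sketch (for the PaDiM point): a rough illustration of the online mean/covariance idea, not the repo's code. It keeps running sums of x and x xᵀ while samples come in, so the full patch library never has to be stored; in PaDiM you would do this per patch position, here it is simplified to a single position with an assumed reduced dimension.

```python
import torch

d = 100                      # assumed reduced embedding dimension
n = 0
sum_x = torch.zeros(d)
sum_xxT = torch.zeros(d, d)

def update(batch: torch.Tensor):
    """Accumulate running sums for one patch position; batch has shape (N, d)."""
    global n
    n += batch.shape[0]
    sum_x.add_(batch.sum(dim=0))
    sum_xxT.add_(batch.T @ batch)

def finalize(eps: float = 0.01):
    """Turn the running sums into the mean and inverse covariance."""
    mean = sum_x / n
    cov = sum_xxT / n - torch.outer(mean, mean) + eps * torch.eye(d)  # small regularisation, as in the PaDiM paper
    return mean, torch.linalg.inv(cov)
```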
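Third sketch (for the PatchCore point): a small illustration of what the faiss route could look like, with made-up sizes; this is not how the repo currently does it. The memory bank lives inside a faiss index and the nearest-neighbour search happens there.

```python
import faiss
import numpy as np

# Stand-in for a coreset-reduced patch library: 100k patches of dim 384.
patch_lib = np.random.rand(100_000, 384).astype("float32")

index = faiss.IndexFlatL2(patch_lib.shape[1])   # exact L2 nearest-neighbour index
index.add(patch_lib)

# Patches of one test image (28x28 grid assumed) queried against the bank.
query = np.random.rand(28 * 28, 384).astype("float32")
distances, _ = index.search(query, 1)           # (squared) distance to the closest memory patch
anomaly_map = distances.reshape(28, 28)
```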
Second, I see you are using StreamingDataset. For this to work you'd need to make a train instance and a test instance:
```python
train_dataset = StreamingDataset()
test_dataset = StreamingDataset()
```
Then you can add samples like this (they are automatically transformed correctly):
```python
from PIL import Image

# Add training and test samples; StreamingDataset applies the right transforms.
for path in train_paths:
    train_dataset.add_pil_image(Image.open(path))

for path in test_paths:
    test_dataset.add_pil_image(Image.open(path))
```
For inference on test images, you then call:
```python
test_idx = 0
sample, *_ = test_dataset[test_idx]
img_lvl_anom_score, pxl_lvl_anom_score = model.predict(sample.unsqueeze(0))
```
Let me know if it works out!
It worked, thanks
Hi, I'm working on a project with your PaDiM implementation. I reached the same issue of exploding memory usage. I've reviewed the code and something is not clear to me: why do you reduce the embedding size only after stacking the feature maps, and not before stacking them? Basically, why build a stack of, for example, 1700 feature maps and then rand_sample, instead of sampling the maps and then stacking? The only thing I see you do (still studying why) is keep track of the sampled indexes. Also, why do you compute the means over the whole patch_lib and not on the reduced version? I don't understand the reason for it.
In any case, thanks for the great repo!
> Hi, I'm working on a project with your PaDiM implementation. I reached the same issue of exploding memory usage. I've reviewed the code and something is not clear to me: why do you reduce the embedding size only after stacking the feature maps, and not before stacking them? Basically, why build a stack of, for example, 1700 feature maps and then rand_sample, instead of sampling the maps and then stacking?
Doing rand sample before stacking is actually a good idea.
> The only thing I see you do (still studying why) is keep track of the sampled indexes.
You have to keep them around for when you do inference on a new sample and want to compare it: the test embedding has to be reduced with the same indices.
> Also, why do you compute the means over the whole patch_lib and not on the reduced version? I don't understand the reason for it.
Valid point. Will update this.
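A minimal sketch of that change, with made-up dimensions (not the actual repo code): draw the channel indices once, reduce every embedding before it goes into the library, and keep the indices so test embeddings can be reduced the same way.

```python
import torch

total_dim, d_reduced = 550, 100                     # assumed embedding sizes
r_indices = torch.randperm(total_dim)[:d_reduced]   # drawn once, kept on the model

patch_lib = []
for _ in range(8):                                  # stand-in for the training images
    embedding = torch.randn(total_dim, 28, 28)      # stand-in per-image embedding
    patch_lib.append(embedding[r_indices])          # reduce *before* stacking
patch_lib = torch.stack(patch_lib)                  # (N, d_reduced, 28, 28): much smaller

# At inference the same indices reduce the test embedding before comparison.
test_embedding = torch.randn(total_dim, 28, 28)[r_indices]
```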
Hi again, I've actually done an implementation that in my case leads to a peak RAM usage that is just 1/4 of the previous one. Unfortunately I'm hitting a bug and I don't know whether it depends on my data or on my code modification. I will keep you posted and I will share the code. Have a nice weekend!
I just added your suggestions, you should pull the latest commit and see if it improves your setup.
Hi, I've seen the changes you made, and they are 90% the same as mine 🤣. The only difference is that I try to handle the `self.r_indices` variable that is used later on. In your current code, if `self.d_reduced` is not smaller than the embedding dimension, `self.r_indices` is left as `None` (a very rare but possible case). This can cause an error in the forward function during inference. Here is my version:
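Roughly, the idea is to always populate the sampling indices, even when no reduction is needed. The following is only a sketch of that guard, with a hypothetical helper name and the attribute names taken from the discussion above, not the exact snippet:

```python
import torch

def choose_reduction_indices(total_dim: int, d_reduced: int) -> torch.Tensor:
    """Return channel indices that are always usable, never None."""
    if d_reduced < total_dim:
        # Normal case: randomly sample d_reduced of the total_dim channels.
        return torch.randperm(total_dim)[:d_reduced]
    # Rare case: d_reduced >= embedding size; keep every channel so the
    # forward pass can still index with these indices (instead of being left with None).
    return torch.arange(total_dim)

# e.g. r_indices = choose_reduction_indices(patch_lib.shape[1], d_reduced)
#      patch_lib_reduced = patch_lib[:, r_indices, ...]
```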
OT: I still have to properly test this, as my current setup leads to other problems that I think are caused by my dataset producing NaNs in the matrices. I will keep you posted if I find something.
Hi, great project! I have 8000 images, and I found that memory usage increases a lot during training. My computer has 60 GB of RAM but it is still not enough.
The test pictures do not seem to need to be transformed. Are pictures in other formats supported?