alexhock / pixplotml

A Visualization Tool for Image-based Machine Learning Projects - Great for object-detection and image classification projects
MIT License

Out of memory #1

Open synthdatagit opened 10 months ago

synthdatagit commented 10 months ago

Hi,

First of all, really cool repo.

I have tried to use it for my own data, but I encounter an issue with memory. My metadata.csv file contains approx. 32000 lines (32000 objects in 2000 images).

When I run pixplot.py I get the following error:

Traceback (most recent call last):
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 1677, in <module>
    parse()
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 1673, in parse
    process_images(config)
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 220, in process_images
    ) = load_and_filter_images(kwargs)
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 286, in load_and_filter_images
    all_loaded_images = load_input_files(kwargs)
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 277, in load_input_files
    img = Image(path, {"metadata": metadata, "vec": vec})
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 1483, in __init__
    self.original = load_image(self.path)
  File "C:\work\pixplotml\pixplot_server\pixplot\pixplot.py", line 1477, in load_image
    return pil_image.open(image_path).convert("RGB")
  File "C:\Users\xxxxxx\Anaconda3\envs\pixplotml\lib\site-packages\PIL\Image.py", line 923, in convert
    return self.copy()
  File "C:\Users\xxxxxx\Anaconda3\envs\pixplotml\lib\site-packages\PIL\Image.py", line 1179, in copy
    return self._new(self.im.copy())
MemoryError

I have attached a screenshot of the memory usage.

Do you have any suggestions for working with larger datasets in pixplotml?

synthdatagit commented 10 months ago

I made a workaround for the memory issue by resizing my original images.
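The thread does not show how the images were resized. A minimal sketch of one way to do it with Pillow follows, assuming the images sit in a single folder. The function name `downscale_images`, the folder arguments, and the `max_side` value are hypothetical and not part of pixplotml. If metadata.csv stores pixel coordinates for the objects, those would need to be scaled by the same factor, which this sketch does not attempt.

```python
from pathlib import Path

from PIL import Image


def downscale_images(src_dir, dst_dir, max_side=1024):
    """Copy images from src_dir to dst_dir, shrinking any whose longer side exceeds max_side.

    Hypothetical helper, not part of pixplotml. Returns the list of written paths.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src_dir.iterdir()):
        # Only touch common raster formats; everything else is skipped.
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        with Image.open(path) as img:
            rgb = img.convert("RGB")
            # thumbnail() resizes in place, keeps the aspect ratio,
            # and never enlarges an image that is already small enough.
            rgb.thumbnail((max_side, max_side))
            out_path = dst_dir / path.name
            rgb.save(out_path)
            written.append(out_path)
    return written
```

Calling `downscale_images("images", "images_small", max_side=1024)` would write the reduced copies into a separate folder and leave the originals untouched, so the smaller set can be pointed at pixplot.py instead.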

Now another problem occurs.

When I press the yellow Enter button on the web GUI, nothing happens. Memory usage increases a bit, but nothing is displayed.

synthdatagit commented 10 months ago

Update: browser inspection output when pressing Enter:

Uncaught TypeError: Cannot read properties of undefined (reading '0')
    at World.setBorderColorImages (tsne.js:1484:14)
    at Labels.updateLabels (tsne.js:3241:9)
    at Labels.init (tsne.js:3094:8)
    at Welcome.<anonymous> (tsne.js:3966:17)

marcoaaz commented 1 month ago

Adding to @synthdatagit's comments: for me, too, the yellow button does nothing on the webpage. The terminal output looks normal, but the images are not loaded, since no interaction with the webpage is possible. I was able to run the demo files (in the output folder), but after fine-tuning with my own data, the images that I prepared cannot be run.

I also noticed that the 'image_vectors.npy' files that I produce only have shape (n_samples x 15) and not (n_samples x 2048), as they are meant to have according to:

image

In my example, the number of unique class_id values is equal to 15. There might be a bug in how I am running the 'main.py' code (pixplotml > prep_pixplot_files). I am pretty sure that if I got arrays with 2048 columns, I would be able to run 'pixplot.py' (within pixplotml > pixplot_server > pixplot) and generate the webpage without issues.
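A quick way to confirm this kind of mismatch before running 'pixplot.py' is to load the saved array and inspect its shape. The sketch below assumes only that 'image_vectors.npy' is a regular NumPy array file. The helper name `check_vector_shape` and the default of 2048 columns (taken from the expectation stated above) are illustrative, not part of pixplotml.

```python
import numpy as np


def check_vector_shape(npy_path, expected_dim=2048):
    """Load a saved embedding array and verify it is (n_samples, expected_dim).

    Hypothetical helper, not part of pixplotml. Returns the array shape on success.
    """
    vectors = np.load(npy_path)
    if vectors.ndim != 2 or vectors.shape[1] != expected_dim:
        raise ValueError(
            f"expected shape (n_samples, {expected_dim}), got {vectors.shape}"
        )
    return vectors.shape
```

With the files described here, a call such as `check_vector_shape("image_vectors.npy")` would raise a `ValueError` reporting the (n_samples, 15) shape, which points at the embedding step rather than at the web page.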

I hope you can help me see my data. Thanks!

Cordially, Marco

marcoaaz commented 4 weeks ago

Dear @alexhock,

I managed to figure out the solution to my issue above. The last stripped off layer of the model was being flattened and provided n_columns = number of classes in 'class_id'. The solution is to replace (in prep_pixplot_files > 'main.py') the following:

    emb_model.fc = torch.nn.Identity(2048)
    emb_model = emb_model.to(device)

by

    emb_model = torch.nn.Sequential(*list(emb_model.children())[:-1])

I hope this is helpful for someone else in the future.
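To illustrate what the `torch.nn.Sequential(*list(emb_model.children())[:-1])` replacement does, here is a small self-contained sketch. It does not use the actual model from 'main.py', which is not shown in this thread. `ToyClassifier` is a hypothetical stand-in with a ResNet-like layout (convolution, pooling, then a final `fc` layer), and the sizes 2048 and 15 are borrowed from the numbers mentioned above.

```python
import torch
from torch import nn


class ToyClassifier(nn.Module):
    """Hypothetical stand-in for a fine-tuned ResNet-style classifier."""

    def __init__(self, feature_dim=2048, num_classes=15):
        super().__init__()
        self.conv = nn.Conv2d(3, feature_dim, kernel_size=3)
        self.relu = nn.ReLU()
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(feature_dim, num_classes)  # final classification layer

    def forward(self, x):
        x = self.avgpool(self.relu(self.conv(x)))
        x = torch.flatten(x, 1)  # flatten happens in forward(), it is not a child module
        return self.fc(x)


model = ToyClassifier().eval()
batch = torch.randn(4, 3, 8, 8)

with torch.no_grad():
    logits = model(batch)
    # Keep every child module except the last one (the fc layer).
    emb_model = nn.Sequential(*list(model.children())[:-1])
    pooled = emb_model(batch)
    # The pooled output keeps two trailing singleton dimensions; flatten them away.
    embeddings = torch.flatten(pooled, 1)

print(logits.shape)      # torch.Size([4, 15])
print(pooled.shape)      # torch.Size([4, 2048, 1, 1])
print(embeddings.shape)  # torch.Size([4, 2048])
```

In this toy setup the truncated model returns a (N, 2048, 1, 1) tensor, so a flatten or squeeze step is needed before saving (n_samples x 2048) vectors. Whether 'main.py' already does that, and whether a separate `.to(device)` call is still required after the replacement, depends on code not shown in this thread.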

Best wishes, Marco