miguelalba / demois

Survey and benchmark of data set and instance selection algorithms.
MIT License

Huge memory requirements #1

Open AlyaMF opened 2 years ago

AlyaMF commented 2 years ago

Hi,

I'm trying to test the Drop3 algorithm that you provided on the MNIST data set, and I made some modifications to the code to fit the data. Every time I run the code, the session crashes because of the high RAM requirement (I have 27 GB). I used the whole training set (60,000 instances) without batching. Is that normal? Do I need extra space, or should I use batches? Could you help me figure out what the issue is?

miguelalba commented 2 years ago

Hi Alya,

Thanks for the feedback! I am sorry, but I wrote this project for my computer science thesis 7-8 years ago and haven't touched it since. I do not even have access to the same computer anymore. I will try to look into it this week.

Regarding your questions: the instance selection algorithms included, except for demoIS, scale really poorly because they compare all the data points with one another through k-NN. That was the whole point of the demoIS algorithm, which IIRC tries to scale linearly by running concurrent batches whose results are later aggregated. Maybe that would work for you with the MNIST data set. It's a pity my thesis is in Spanish; it's well documented there.
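
To put rough numbers on this: if an implementation holds a full pairwise distance matrix for 60,000 instances in 8-byte floats, that matrix alone is about 28.8 GB (roughly 26.8 GiB). Whether the Drop3 code here does exactly that is an assumption, not something checked against the repository. The Python sketch below shows that estimate together with the general batching idea: run a selector on disjoint random batches and merge the kept indices. The names `select_in_batches` and `keep_border_points` are hypothetical, and the selector is a simple placeholder, not Drop3 and not the actual demoIS code.

```python
import numpy as np


def full_distance_matrix_gib(n_instances, bytes_per_value=8):
    """Memory (GiB) needed to hold a dense n x n distance matrix."""
    return n_instances ** 2 * bytes_per_value / 2 ** 30


def keep_border_points(X, y):
    """Placeholder selector (hypothetical, NOT Drop3).

    Keeps the instances whose nearest other instance in the batch
    has a different label. Returns indices local to the batch.
    """
    sq = (X ** 2).sum(axis=1)
    # Squared Euclidean distances, only batch_size x batch_size in memory.
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d, np.inf)  # an instance is not its own neighbour
    nearest = d.argmin(axis=1)
    return np.flatnonzero(y[nearest] != y)


def select_in_batches(X, y, batch_size, select_fn, seed=0):
    """Apply select_fn to disjoint random batches and merge the results.

    Returns sorted indices into the full data set.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    kept = []
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        local_keep = select_fn(X[idx], y[idx])
        kept.append(idx[local_keep])  # map batch-local indices back
    return np.sort(np.concatenate(kept))


if __name__ == "__main__":
    print(f"60,000 x 60,000 float64 matrix: {full_distance_matrix_gib(60_000):.1f} GiB")
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 5))
    y = (X[:, 0] > 0).astype(int)
    kept = select_in_batches(X, y, batch_size=100, select_fn=keep_border_points)
    print(f"kept {len(kept)} of {len(X)} instances")
```

With batches of size b, each batch needs only a b x b distance matrix, so peak memory depends on the batch size and not on the full data set. The trade-off is that neighbours are only searched within a batch, so the selected subset can differ from what a single pass over all the data would return.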

Best regards, Miguel

AlyaMF commented 2 years ago

Hi Miguel, thank you for replying; I appreciate your help. I really wish I could read your thesis, but at least I now understand that the instance selection algorithms perform poorly because they compare the k-NN of all the instances. It took me around 15 hours of training, so it is really hard to repeat the experiment each time, but I am trying to work around this issue.

Thank you again for your help. Alya M