GateNLP / gate-core

The GATE Embedded core API and GATE Developer application
GNU Lesser General Public License v3.0

closing documents in the UI is slow #140

Open greenwoodma opened 3 years ago

greenwoodma commented 3 years ago

If I populate a corpus from a file of 10,000 Tweets (without using a datastore), it takes less than a minute to create all the documents and add them to the resource tree. If I then select them all and try to close them, it takes an awfully long time (I gave up after about 10 minutes). There also seems to be a lot of CPU activity (50% on my laptop) but almost no GC, so I don't think this is related to freeing memory as documents are removed. I know the easy answer is to use a datastore, but it still seems odd that removing the documents is so much slower than loading them.

johann-petrak commented 3 years ago

Just a stab in the dark, as I have not looked at the code: could it be that the GUI is getting updated for each document that gets removed, rather than only once after everything has been done?

greenwoodma commented 3 years ago

Yes, the GUI gets updated after each doc is removed, but it also gets updated after each document is added. Now, unless adding a node to a tree is a lot quicker than removing it, something else must be going on as well. I suppose at some point I should try a test from the API without the GUI to see what happens.
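
To make the "one update per document versus one update at the end" idea concrete, here is a minimal sketch using plain Swing classes only. It is not GATE code and does not reflect how the GATE resource tree is actually implemented; it simply assumes a tree backed by a `DefaultTreeModel`. The class name `BatchedTreeRemoval`, its helper methods, and the counting listener are hypothetical names made up for illustration. The listener stands in for a `JTree` that would have to react to each event.

```java
import javax.swing.event.TreeModelEvent;
import javax.swing.event.TreeModelListener;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;

// Hypothetical illustration: not part of GATE.
public class BatchedTreeRemoval {

    /** Counts every event the model fires; a real JTree would update on each one. */
    static final class CountingListener implements TreeModelListener {
        int events = 0;
        @Override public void treeNodesChanged(TreeModelEvent e) { events++; }
        @Override public void treeNodesInserted(TreeModelEvent e) { events++; }
        @Override public void treeNodesRemoved(TreeModelEvent e) { events++; }
        @Override public void treeStructureChanged(TreeModelEvent e) { events++; }
    }

    /** Builds a model whose root has n children, one per pretend document. */
    static DefaultTreeModel buildModel(int n) {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("corpus");
        for (int i = 0; i < n; i++) {
            root.add(new DefaultMutableTreeNode("doc" + i));
        }
        return new DefaultTreeModel(root);
    }

    /** Removes children through the model one at a time: one event per node. */
    static int removeOneByOne(DefaultTreeModel model) {
        CountingListener listener = new CountingListener();
        model.addTreeModelListener(listener);
        DefaultMutableTreeNode root = (DefaultMutableTreeNode) model.getRoot();
        while (root.getChildCount() > 0) {
            model.removeNodeFromParent((DefaultMutableTreeNode) root.getFirstChild());
        }
        model.removeTreeModelListener(listener);
        return listener.events;
    }

    /** Detaches all children silently, then notifies listeners a single time. */
    static int removeBatched(DefaultTreeModel model) {
        CountingListener listener = new CountingListener();
        model.addTreeModelListener(listener);
        DefaultMutableTreeNode root = (DefaultMutableTreeNode) model.getRoot();
        root.removeAllChildren();          // changes the nodes, fires no event
        model.nodeStructureChanged(root);  // one treeStructureChanged event
        model.removeTreeModelListener(listener);
        return listener.events;
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("one by one: " + removeOneByOne(buildModel(n)) + " events");
        System.out.println("batched:    " + removeBatched(buildModel(n)) + " events");
    }
}
```

This only counts notifications and says nothing about where the time goes in GATE itself. Since adding documents also updates the GUI once per document and is fast, event count alone may well not explain the slowdown, which is why a test from the API without the GUI would still be worth doing.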