adamhong / maui-indexer

Automatically exported from code.google.com/p/maui-indexer

Key extraction against LCSH skos causes OutOfMemoryError #5

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hi, 
I'm trying to do keyphrase extraction for my documents against the LCSH SKOS 
dictionary (about 400 MB). The problem I'm experiencing is an OutOfMemoryError. I 
had a look at the code and, as far as I can see, the cause seems to be the (huge) 
in-memory vocabulary object.
Is there a workaround for that? I tried to use an RDBMS-based persistent Jena 
model, but I still hit the same problem when the module tries to build the 
vocabulary by loading all the model statements.

Any thoughts?

Best regards,
Andrea

Original issue reported on code.google.com by a.gazzarini@gmail.com on 30 Jun 2010 at 5:40

GoogleCodeExporter commented 8 years ago
Did you try increasing the memory that your Java application has access to?
-Xmx400m

Original comment by medel...@gmail.com on 30 Jun 2010 at 8:51
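
When raising the heap limit as suggested above, it can help to confirm what limit the JVM actually received. The following is a minimal sketch; the class name HeapCheck is hypothetical and not part of Maui. It only uses the standard Runtime.maxMemory() call. Note that a parsed 400 MB RDF file is likely to need considerably more heap than its size on disk, so a value larger than 400m may be worth trying.

```java
// Hypothetical helper, not part of Maui: reports the heap limit
// that the running JVM was actually given (for example via -Xmx).
final class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in mebibytes. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Run it with the same option you pass to your application, for example java -Xmx2g HeapCheck (the 2g value is only an illustration), and compare the reported figure with what you expected.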

GoogleCodeExporter commented 8 years ago
Hi,
Yes, I tried, but unfortunately that doesn't solve the issue. In the best 
case I don't get an OutOfMemoryError, but it took hours to build the whole 
vocabulary.
I was wondering if it's possible to build the vocabulary lazily (on demand) 
instead of creating it all at startup.

Cheers,
Andrea

Original comment by a.gazzarini@gmail.com on 1 Jul 2010 at 6:36

GoogleCodeExporter commented 8 years ago
Anyway, congratulations on the project. It seems very interesting.
I'm very busy at the moment, but I'd like to contribute in some way.
This is me: http://www.linkedin.com/in/andreagazzarini

Regards,
Andrea

Original comment by a.gazzarini@gmail.com on 1 Jul 2010 at 6:39

GoogleCodeExporter commented 8 years ago
Hi Andrea,

I have been using an LCSH file. In compressed form, as taken in by Maui, it is 
less than 10 MB, and I have encountered no memory issues. I can send it to you 
if you want to try it out.

You do have a good point about building the vocabulary on demand. I haven't 
found a way of doing it so far. One could write an alternative Vocabulary class 
that operates on a vocabulary stored in a MySQL database, for example, but that 
would make Maui more complex in my view.

Cheers
Alyona

Original comment by medel...@gmail.com on 1 Jul 2010 at 11:42
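
The on-demand vocabulary discussed in the comments above could look roughly like the sketch below. This is not Maui's Vocabulary class and does not reproduce its API; LazyVocabulary, conceptsFor and cachedTerms are hypothetical names chosen for illustration. The loader function stands in for whatever backing store is used (a MySQL table, a persistent Jena model, an on-disk index), and each term is looked up only the first time it is requested.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch, not Maui's Vocabulary class: entries are fetched
// from a backing store only when a term is first requested.
final class LazyVocabulary {

    // Assumed lookup into the backing store, e.g. a database query
    // that maps a normalized term to its concept ids.
    private final Function<String, List<String>> loader;

    // Terms already looked up; unbounded here for brevity.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    LazyVocabulary(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    /** Returns the concept ids for a term, loading them on first use. */
    List<String> conceptsFor(String term) {
        return cache.computeIfAbsent(term, loader);
    }

    /** Number of distinct terms loaded so far. */
    int cachedTerms() {
        return cache.size();
    }
}
```

With this shape nothing is loaded at startup, so memory grows only with the terms that actually occur in the documents being indexed. The trade-off is one store lookup per new term, and a real implementation would probably want a bounded cache rather than the unbounded map used here.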

GoogleCodeExporter commented 8 years ago
Hi.
I'm curious if there are any research papers (or additional info) on using 
maui-indexer with the LCSH skos dictionary.
Thanks!

--Brad

Original comment by brad.mac...@gmail.com on 25 Jun 2014 at 9:18

GoogleCodeExporter commented 8 years ago
Hi Brad, not much. A few words on this topic here: 
http://www.medelyan.com/files/jcdl10_subject_metadata_support_maui.pdf

Original comment by medel...@gmail.com on 25 Jun 2014 at 9:27