lileipisces / Sentires-Guide

A quick guide to Sentires: Phrase-level Sentiment Analysis toolkit, SIGIR'14

Can we use multiple threads to speed up the POS computation? #2

Open doujiang-zheng opened 3 years ago

doujiang-zheng commented 3 years ago

I ran experiments on a 24-core Ubuntu server with 128 GB of memory. The default reviews_Musical_Instruments_5.json.gz contains 10K lines of reviews and takes around 17 minutes on my server. I then tried reviews_Electronics_5.json.gz, another category of the Amazon datasets, with 1 million lines of reviews. That experiment is stuck at the POS (part-of-speech) stage and has already been running for 19 hours. I found that the process has many subprocesses but runs on only a single core. Could you please help me? Thanks for reading.

lileipisces commented 3 years ago

Hi, I really have no idea, because this Java tool was not developed by me. It took me several days to process the Amazon and Yelp datasets used in our CIKM'20 paper. Because of the slow processing speed, I even removed users and items with fewer than 20 records. Maybe you could read the original documentation to see whether there is a solution. Please do let me know if you fix it. Thanks!
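
For anyone who wants to shrink the input the same way before running the toolkit, here is a minimal sketch of the filtering step, not the exact script used for the paper. It assumes one JSON object per line with the `reviewerID` and `asin` fields found in the Amazon review files; the function names and the single-pass strategy are illustrative choices. Because it filters only once, a user or item can still end up with fewer than `min_count` records after the other side is pruned.

```python
import gzip
import json
from collections import Counter


def read_json_lines_gz(path):
    """Yield one dict per non-empty line of a gzipped JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def filter_min_records(records, min_count=20,
                       user_key="reviewerID", item_key="asin"):
    """Keep records whose user AND item each appear at least min_count times.

    Counts are taken once over the full input (single pass, assumed here);
    the filter is not repeated until a fixed point is reached.
    """
    records = list(records)
    user_counts = Counter(r[user_key] for r in records)
    item_counts = Counter(r[item_key] for r in records)
    return [
        r for r in records
        if user_counts[r[user_key]] >= min_count
        and item_counts[r[item_key]] >= min_count
    ]


# Example usage (the file name comes from the question above):
# reviews = read_json_lines_gz("reviews_Electronics_5.json.gz")
# kept = filter_min_records(reviews, min_count=20)
```

The filtered list can then be written back out in whatever format the preprocessing step expects. This reduces the amount of text the POS stage has to handle but does not make that stage itself run in parallel.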