fani-lab / SEERa

A framework to predict the future user communities in a text streaming social network based on the users’ topics of interest.

ECIR2021. Tweet Length Matters: A Comparative Analysis on Topic Detection in Microblogs #33

Open soroush-ziaeinejad opened 2 years ago

soroush-ziaeinejad commented 2 years ago

Why did I choose this paper? Because it analyzes the effect of tweet length on topic modeling methods.

Main problem:

Which model is better for topic detection in short texts such as tweets? Does the length of a tweet affect the performance of topic detection methods?

Existing work:

The main issue with most common topic detection methods is that they are designed and trained to extract topics from regular-length text (in terms of the number of words), not from short texts.

Inputs:

Tweets

Outputs:

Method:

Preprocessing (data cleaning):

The preprocessed tweets are given to the models, and the F-measure is calculated to determine each model's performance. After that, the tweets are pooled by length, and the top four models are trained on each pool. The results show whether the length of a text really affects the performance of topic modeling methods.
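The length-pooling evaluation described above can be sketched roughly as follows. This is only an illustration, not the paper's code (which is unavailable): the pool boundaries, the `text`/`gold`/`pred` field names, the toy tweets, and the choice of macro-averaged F1 are all assumptions, and the sketch scores precomputed predictions instead of training a model per pool.

```python
from collections import defaultdict


def pool_by_length(tweets, boundaries):
    """Group tweets into pools by word count.

    `boundaries` is an ascending list of inclusive upper bounds (hypothetical
    values); tweets longer than the last bound go into a final open-ended pool.
    """
    pools = defaultdict(list)
    for tweet in tweets:
        n_words = len(tweet["text"].split())
        pool_id = len(boundaries)  # default: the open-ended last pool
        for i, upper in enumerate(boundaries):
            if n_words <= upper:
                pool_id = i
                break
        pools[pool_id].append(tweet)
    return dict(pools)


def macro_f1(gold, predicted):
    """Macro-averaged F1 over all labels seen in gold or predicted."""
    labels = set(gold) | set(predicted)
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def f1_per_pool(tweets, boundaries):
    """Compute the F-measure separately for each length pool."""
    return {
        pool_id: macro_f1([t["gold"] for t in pool], [t["pred"] for t in pool])
        for pool_id, pool in pool_by_length(tweets, boundaries).items()
    }


# Toy data (made up for illustration): topic labels and model predictions.
tweets = [
    {"text": "goal", "gold": "sports", "pred": "sports"},
    {"text": "new phone out", "gold": "tech", "pred": "sports"},
    {"text": "the election results are in tonight", "gold": "politics", "pred": "politics"},
    {"text": "team wins the final match of the season", "gold": "sports", "pred": "sports"},
]
scores = f1_per_pool(tweets, boundaries=[3, 6])
for pool_id in sorted(scores):
    print(pool_id, round(scores[pool_id], 3))
```

Comparing the per-pool scores of the same model is what indicates whether tweet length affects its performance; in the paper's setup, the predictions for each pool would come from a model trained on that pool.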

Experimental Setup:

Baselines:

Results:

Code:

The code for this paper is unavailable. The dataset is available at: https://github.com/avaapm/ECIR2021

Presentation:

There is no available presentation for this paper.

hosseinfani commented 2 years ago

@soroush-ziaeinejad where is the body?!

soroush-ziaeinejad commented 2 years ago

> @soroush-ziaeinejad where is the body?!

It will be added today. For now, I just wanted to put these papers on the to-do list.