Open PrinceTitiya opened 2 months ago
Suggested coming up with a list of datasets and papers for cross-lingual sentiment analysis.
We have updated the details of the project.
Looks good. Marking as minor revision. Please explain the metrics a little more, maybe with an example.
Our model learns topics in one language (here, English) and predicts them for unseen documents in other languages. We will evaluate the quality of the topic predictions by comparing the topics predicted for the same document in different languages (Italian, Portuguese) against those predicted for its English version.
We will use the following evaluation metrics: Matches, Centroid Similarity, and KL Divergence.
Looks good. Marking as approved
Title
Cross-lingual Contextualized Topic Models with Zero-shot Learning.
Team Name
TeamTatakae
Email
202318024@daiict.ac.in
Team Member 1 Name
Mitul Dudhat
Team Member 1 Id
202318024
Team Member 2 Name
Ayush Patel
Team Member 2 Id
202318036
Team Member 3 Name
Prince Titiya
Team Member 3 Id
202318010
Team Member 4 Name
Hiten Gondaliya
Team Member 4 Id
202318063
Category
Reproducibility
Problem Statement
The paper introduces a zero-shot cross-lingual topic model that learns topics from English documents and predicts them for documents in languages unseen during training, such as Italian and Portuguese, without needing translations. It overcomes a limitation of traditional bag-of-words topic models, whose vocabulary ties them to the training language, with the aim of keeping the transferred topics coherent and stable across languages.
Evaluation Strategy
Matches, Centroid Similarity, and KL Divergence.
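To make the three metrics concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's evaluation code. It assumes each document has a topic distribution predicted from its English version (`theta_en`) and from its version in another language (`theta_xx`), and that each topic has a precomputed centroid vector (the average embedding of its top words). The function names, the toy numbers, and the KL direction (English relative to the other language) are all hypothetical choices made for this example.

```python
import math


def argmax(values):
    """Index of the largest value, i.e. the most likely topic."""
    return max(range(len(values)), key=lambda i: values[i])


def matches(theta_en, theta_xx):
    """Fraction of documents whose top topic is the same in both languages."""
    same = sum(argmax(p) == argmax(q) for p, q in zip(theta_en, theta_xx))
    return same / len(theta_en)


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def centroid_similarity(theta_en, theta_xx, topic_centroids):
    """Mean cosine similarity between the centroids of the two top topics.

    Unlike matches, this gives partial credit when the predicted topics
    differ but are described by similar words.
    """
    sims = [
        cosine(topic_centroids[argmax(p)], topic_centroids[argmax(q)])
        for p, q in zip(theta_en, theta_xx)
    ]
    return sum(sims) / len(sims)


def kl_divergence(theta_en, theta_xx, eps=1e-12):
    """Mean KL(theta_en || theta_xx) over documents; lower is better."""
    total = 0.0
    for p, q in zip(theta_en, theta_xx):
        total += sum(
            pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
        )
    return total / len(theta_en)


# Toy example: 2 documents, 3 topics (made-up numbers).
theta_en = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # predicted from English text
theta_it = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]   # predicted from Italian text
topic_centroids = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 2-d centroids

print(matches(theta_en, theta_it))                               # 0.5
print(centroid_similarity(theta_en, theta_it, topic_centroids))  # 0.5
print(kl_divergence(theta_en, theta_it))
```

In this toy case the first document gets the same top topic in both languages and the second does not, so Matches is 0.5. Higher Matches and Centroid Similarity, and lower KL Divergence, indicate that the topics transfer more consistently across languages.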
Dataset
Dataset Link - https://github.com/vinid/data (The dataset was extracted from DBpedia, and the authors have made it publicly available on GitHub.)
Resources
Cross-lingual Contextualized Topic Models with Zero-shot Learning. Research paper link - https://arxiv.org/abs/2004.07737