narratives-of-war / topic-rnn

Implementation (in progress) of Dieng et al.'s TopicRNN intended to be used as a baseline and starting point.
MIT License

Top words per topic? #1

Open LosSherl opened 6 years ago

LosSherl commented 6 years ago

How to get top words per topic using this model?

dangitstam commented 6 years ago

self.beta is a K x V matrix where K is the number of topics and V is the size of the vocabulary. To get the top words per topic atm, you'd have to collect that matrix after training, map indices to their values, sort by the values, and print the words the indices map to.
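
A minimal sketch of those steps, assuming the matrix has already been pulled out of the model as K rows of V weights (for a PyTorch tensor, `.tolist()` gives that) and that a list mapping vocabulary indices to words is available. The names `top_words_per_topic` and `index_to_word` are hypothetical and not part of this repo:

```python
import heapq


def top_words_per_topic(beta, index_to_word, n=10):
    """Return the n highest-weighted words for each topic.

    beta: K rows (one per topic), each a sequence of V weights.
    index_to_word: hypothetical mapping from vocabulary index to word.
    """
    top = []
    for weights in beta:
        # Pair each vocabulary index with its weight and keep the n largest.
        best = heapq.nlargest(n, enumerate(weights), key=lambda pair: pair[1])
        top.append([index_to_word[i] for i, _ in best])
    return top


# Toy example: K = 2 topics, V = 4 words (made-up weights).
beta = [[0.1, 2.0, -0.5, 1.2],
        [3.0, 0.0, 0.7, -1.0]]
index_to_word = ["war", "army", "treaty", "king"]

for k, words in enumerate(top_words_per_topic(beta, index_to_word, n=2)):
    print(k, words)
```

Since stopwords are tracked separately in this model, it may be worth filtering them out of `index_to_word` before printing, but that is a design choice rather than something the code above does.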

Much of the project is still under development! Built-in functionality to print the top words per topic is on its way.

LosSherl commented 6 years ago

Thank you! May I ask how long it will take to finish?

dangitstam commented 6 years ago

Not quite sure yet; I've yet to run experiments to make sure we can reproduce results similar to the paper's.

The original paper is under review; once it's approved the authors plan on releasing the code so keep an eye out for that as well.

LosSherl commented 6 years ago

After running this command:

python train_model.py --model-type topic

I got this output:

GPU available but not running with CUDA (use --cuda to turn on.)
Building corpus:
Restricting vocabulary based on min token count 10
Collecting stopwords:
Collecting War Wikipedia JSONs:
Vocabulary Size: 5388
Stop size: 523
Vocab size no stops: 4865
Building topic RNN model
------------------
Training in progress:
Traceback (most recent call last):
  File "train_model.py", line 431, in <module>
    main()
  File "train_model.py", line 185, in main
    args.cuda)
  File "train_model.py", line 284, in train_topic_rnn
    stop_indicators, cuda)
  File "E:\Design\topic-rnn\topic_rnn_rc\models\topic_rnn.py", line 143, in likelihood
    mapped_term_frequencies = self.g(Variable(term_frequencies))
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "E:\Design\topic-rnn\topic_rnn_rc\models\topic_rnn.py", line 229, in forward
    output = self.model(term_frequencies)
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 64, in forward
    input = module(input)
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 54, in forward
    return self.backend.Linear.apply(input, self.weight, self.bias)
  File "D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\_functions\linear.py", line 12, in forward
    output.addmm(0, 1, input, weight.t())
RuntimeError: matrices expected, got 1D, 2D tensors at d:\downloads\pytorch-master-1\torch\lib\th\generic/THTensorMath.c:1233

Is there anything wrong?

dangitstam commented 6 years ago

Hi @LosSherl, train_model.py expects data to be in a very particular format, one that I haven't had the chance to write a README for.

The model itself is still in development (it runs properly for the JSON format it was meant for but we haven't confirmed that it matches the results of the paper).

Once I've had a chance to outline the required JSON format and document the code in general, I'll push it to master. Until master has been updated, the code is experimental and not supported for users.

LosSherl commented 6 years ago

I was using a dataset acquired with the scripts in this repo: narratives-of-war/data-collection. I thought it would work, and it did work when the --model-type parameter was left at its default, vanilla.

LosSherl commented 6 years ago

Hi @dangitstam, was it really because of the data format? Here is an example of the data I used:

{
  "sections": [
    {
      "heading": "Introduction",
      "text": "On 18\u201319 October 1965, a group of ethnic Hutu officers from the Burundian military attempted to overthrow Burundi's government in a coup d'\u00e9tat. The rebels were angry about the apparent favouring of ethnic Tutsi minority by Burundi's monarchy after a period of escalating ethnic tension following national independence from Belgium in 1962. Although the Prime Minister was shot and wounded, the coup failed and soon provoked a backlash against Hutu in which thousands of people, including the participants in the coup, were killed. The coup also facilitated a militant Tutsi backlash against the moderate Tutsi monarchy resulting in two further coups which culminated in the abolition of Burundi's historic monarchy in November 1966 and the rise of Michel Micombero as dictator.\n\n\n"
    },
    {
      "heading": "Background",
      "text": "In 1962, the Belgian mandate of Ruanda-Urundi received independence, creating the Republic of Rwanda and the Kingdom of Burundi. Both states had traditionally had monarchies dominated by the Tutsi ethnic group over a Hutu ethnic majority but Rwanda's monarchy was abolished by a political revolution in 1959-61. In the first years of independence, Burundi seemed to have achieved a balance between ethnic groups which brought members of the different ethnic groups into government, moderated in part by the mwami (king) Mwambutsa IV who was popular with all groups but was himself Tutsi. Both Tutsi, Hutu and Ganwa were part of the dominant political party, the Union for National Progress (Union pour le Progr\u00e8s national, UPRONA). In October 1961, shortly before the date scheduled for independence, the Burundian Prime Minister Prince Louis Rwagasore was assassinated, raising ethnic tensions in the country. \nAfter a period of rule by Tutsi prime ministers, Mwambutsa appointed Burundi's first Hutu leader, Pierre Ngendandumwe, but Ngendandumwe was assassinated in January 1965 by a Rwandan Tutsi. Elections held in May 1965 took place in an atmosphere of strong ethnic tension. Hutu candidates gained a majority, but Mwambutsa deposed the Hutu Prime Minister Joseph Bamina and instead installed a Tutsi candidate, L\u00e9opold Biha, in October 1965.\n\n\n"
    },
    {
      "heading": "Coup and aftermath",
      "text": "The installation of Biha as Prime Minister created dissent between Hutus and the Burundian monarchy. A group of Hutu officers in the army attempted a coup d'\u00e9tat against the Tutsi-led government on 18\u201319 October 1965.\nA small group of Hutu members of the Army and Gendarmerie marched on the Royal Palace. Biha was shot and wounded. The coup was foiled by troops led by the Tutsi military officer, Michel Micombero. They were led by Gervais Nyangoma, a parliamentarian, and Antoine Serukwavu, a Gendarmerie commander, were the leaders. 34 Hutu soldiers who had been involved in the coup were arrested and executed. Hutus in the military and police who had not taken part were also arrested and many killed. The coup's failure provoked immediate ethnic violence across the country in which thousands of people, mainly Hutus, were killed in what has been seen as a prelude to the Burundian genocide of 1972.\nAs a result of the coup, Mwambutsa fled into exile and never returned to Burundi. The failure of the Hutu military coup d'\u00e9tat created a Tutsi counter-reaction and laid the foundation for extreme Tutsi factions to seize power for themselves, first installing a new mwami and later abolishing the monarchy altogether in November 1966. \nMicombero, later promoted to Prime Minister, would lead the second coup d'\u00e9tat and become Burundi's first republican President and de facto dictator until 1976.\n\n\n"
    },
    {
      "heading": "References",
      "text": "\n\n"
    },
    {
      "heading": "Bibliography",
      "text": "Lemarchand, Ren\u00e9 (1995). Burundi: Ethnic Conflict and Genocide. New York: Woodrow Wilson Center Press. ISBN 0-521-45176-0. \n\n\n"
    }
  ],
  "title": "1965 Burundian coup d'\u00e9tat attempt"
}

Thanks