Write Proposal - Githubissues

pmiri commented 7 years ago

Write 1 pager for Prof to approve. This serves as a way for us to all agree on the same thing, too.

sophiemongrain commented 7 years ago

First draft pushed to /docs/proposal

pmiri commented 7 years ago

Are we only seeking to test our hypothesis with karma? What about the number of replies gathered? Total unique users who replied? And so on.
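Both of those look cheap to compute once the replies are in hand. A rough sketch, assuming each reply is available as an `(author, body)` pair; the `reply_metrics` name and the tuple shape are made up for illustration and are not anything in the repo:

```python
from typing import Dict, List, Tuple


def reply_metrics(replies: List[Tuple[str, str]]) -> Dict[str, int]:
    """Count replies and distinct reply authors for one post.

    `replies` is a hypothetical list of (author, body) pairs.
    """
    unique_authors = {author for author, _body in replies}
    return {
        "reply_count": len(replies),
        "unique_repliers": len(unique_authors),
    }


if __name__ == "__main__":
    sample = [("alice", "nice"), ("bob", "nope"), ("alice", "really nice")]
    print(reply_metrics(sample))
```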

p-hebert commented 7 years ago

Should we also do extra analysis and compare subreddits against each other in terms of karma per capita or thread, connectedness of the subreddits, and mood detection on the threads?

Aside from that, great writeout. I'm actually much more hyped now :3

sophiemongrain commented 7 years ago

Mood detection (the industry phrase is "sentiment analysis") would be a lot of fun actually and a very interesting metric to compare with connectedness. There's a Node lib here: https://github.com/thisandagain/sentiment that I haven't evaluated but might be promising.

p-hebert commented 7 years ago

I'm not worried, we'll find a cool free library for that :)

p-hebert commented 7 years ago

Actually, mood detection may be crucial to understanding connectedness - it may allow us to rate connections as enemy or friendly :O

sophiemongrain commented 7 years ago

If you think there's a scalable way to add metadata about sentiment to each individual comment, that would defo be dope and would let us label edges in the graph as positive/negative. Who knows where that might lead but it sounds promising
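A minimal sketch of what the edge labelling could look like, assuming every comment already carries a numeric sentiment score (negative below zero, positive above) from whichever library we end up picking. The `Comment` fields and the `label_edges` helper are hypothetical names for illustration only:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Comment:
    id: str
    author: str
    parent_id: Optional[str]  # None for the original post
    sentiment: float          # assumed convention: < 0 negative, > 0 positive


def label_edges(comments: List[Comment]) -> List[Tuple[str, str, str]]:
    """Return (replier, replied_to, label) edges for every reply."""
    by_id = {c.id: c for c in comments}
    edges = []
    for c in comments:
        parent = by_id.get(c.parent_id) if c.parent_id is not None else None
        if parent is None:
            continue  # the original post, or a reply whose parent we never scraped
        if c.sentiment > 0:
            label = "positive"
        elif c.sentiment < 0:
            label = "negative"
        else:
            label = "neutral"
        edges.append((c.author, parent.author, label))
    return edges
```

Scoring happens once per comment and the labelling is a single pass, so the cost should grow linearly with the number of comments.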

p-hebert commented 7 years ago

I'm sure there is. If we want to push even further, we can use sentence deconstruction to check whether animosity/friendliness is directed at one of the users or simply at the topic. This requires an NLP engine, which is more work, and we have to set limits at some point. We could also estimate the probability that the sentiment is directed at an individual.

sophiemongrain commented 7 years ago

Would be cool, but determining the direction of comments is probably where we go beyond what's required for an A. I also think it's safe enough for the purposes of our model to assume that most comments with a depth of 1 that have a negative sentiment are upset about the OP, and that those at any greater depth (further down the thread) are upset about the comment they're replying to.
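That assumption is small enough to write down directly. A sketch, taking depth 1 to mean a direct reply to the post and anything deeper to mean a reply to another comment; `sentiment_target` and its return labels are hypothetical names:

```python
from typing import Optional


def sentiment_target(depth: int, sentiment: float) -> Optional[str]:
    """Guess what a negative comment is upset about, using depth alone.

    Returns "op" for direct replies to the post, "parent_comment" for
    anything deeper, and None when the comment is not negative.
    """
    if depth < 1:
        raise ValueError("depth must be 1 or greater for a comment")
    if sentiment >= 0:
        return None
    return "op" if depth == 1 else "parent_comment"
```

This is only the heuristic from the paragraph above, not real direction detection: a negative reply three levels down could still be aimed at the OP.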

Another metric might involve a combination of sentiment and karma (e.g., do some subs award more karma for negative comments and some for positive ones?). This is very fertile territory from an analysis point of view.
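One way this comparison might be computed, assuming comments arrive as `(subreddit, sentiment, karma)` tuples with the same sign convention for sentiment. The function name and the input shape are illustrative assumptions:

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple


def karma_by_sentiment(
    comments: Iterable[Tuple[str, float, int]]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Average karma of positive vs. negative comments, per subreddit.

    Neutral comments (sentiment == 0) are ignored. A bucket with no
    comments is reported as None rather than 0.
    """
    buckets: Dict[str, Dict[str, List[int]]] = defaultdict(
        lambda: {"positive": [], "negative": []}
    )
    for subreddit, sentiment, karma in comments:
        if sentiment > 0:
            buckets[subreddit]["positive"].append(karma)
        elif sentiment < 0:
            buckets[subreddit]["negative"].append(karma)
    return {
        sub: {label: (mean(vals) if vals else None) for label, vals in groups.items()}
        for sub, groups in buckets.items()
    }
```

Comparing the two averages within a sub would hint at whether that community tends to reward negativity or positivity.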

frazs commented 7 years ago

re: To accomplish this, we will gather data from the popular social network Reddit and use them to construct language models that will pose as commenters, and examine to what extent our automated agents can ``pose'' as real members of a particular community.

"language models that will pose as commenters" sounds confusing to me. Do you mean something like "language models for bots that will pose as commenters"?

Also, how many agents are we talking about? Will there be a bot per sub-reddit, per sub-reddit group (e.g. right-wing bot, neutral bot, left-wing bot), or one bot for them all? Similarly, is language modelling per subreddit, per group, or the whole thing?

re: To reduce the quantity of data we need to process as well as to obtain higher-quality (and more representative) data, we will focus our attention on posts that score above a minimum karma threshhold, which will be determined based on the average karma of the subreddit the post originates from.

What is "average karma"? The average karma of all comments on all posts of that subreddit? Some sort of representative sample? Or all the comments on a specific post at the time of the scan (& we scan posts only with x amounts of comments)?
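To make the question concrete, here is a sketch of just one possible reading: "average karma" taken as the mean score of the posts we sampled from that subreddit, with posts kept only if they score above a multiple of that mean. The `multiplier` parameter, the `(subreddit, post_id, karma)` tuple shape and the function name are all assumptions, not what the proposal specifies:

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Post = Tuple[str, str, int]  # (subreddit, post_id, karma), hypothetical shape


def filter_by_karma(posts: List[Post], multiplier: float = 1.0) -> List[Post]:
    """Keep posts scoring above multiplier * mean karma of their subreddit."""
    scores: Dict[str, List[int]] = defaultdict(list)
    for subreddit, _post_id, karma in posts:
        scores[subreddit].append(karma)
    # One threshold per subreddit, derived from the sampled posts only.
    thresholds = {sub: multiplier * mean(vals) for sub, vals in scores.items()}
    return [p for p in posts if p[2] > thresholds[p[0]]]
```

Under the other readings (average over comments, or over a single post's comments at scan time) only the way `scores` is filled would change.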

The rest looks good & clear to me. There's still a couple typos e.g. threshhold -> threshold and weight -> weigh; I'll do a thorough proofread pass once our content is finalised.

sophiemongrain commented 7 years ago

Incorporated your suggestions Frances (except for "weight", which is correct in this context (https://en.wiktionary.org/wiki/weight#Verb))

pmiri commented 7 years ago

Submitted

pmiri commented 7 years ago

Adding references

pmiri commented 7 years ago

References added:

ML (per @mmongrain)

ftp://ftp.idsia.ch/pub/juergen/TimeCount-IJCNN2000.pdf
http://jmlr.org/proceedings/papers/v37/jozefowicz15.pdf
https://arxiv.org/pdf/1503.04069.pdf

Political Bots

http://ijoc.org/index.php/ijoc/article/view/6135
http://www.gm.fh-koeln.de/~hk/lehre/sgmci/ss2012/material/Socialbots_VoicesFromTheFront.pdf

sophiemongrain commented 7 years ago

crushed it brethren and sistren

pmiri / bloviator

Write Proposal #7
