Open xmatthewx opened 8 years ago
@auremoser – can you add some quick notes on the tool and process you identified?
yay
count of
Super dupe, requirements and ideas are in the readme along with the data now, in this github: https://github.com/auremoser/mofolondon-ether.
more counts:
Might be best to use this as a light pilot, since only 20 etherpads have content. I'm not sure folks knew there were pads. It's a shame! We should push it next time.
I'll add more soon!
Thanks again for pushing this forward @auremoser. A couple thoughts:
duds.csv
Cool @xmatthewx! I'm off today but going to get back on this tomorrow. Yes, I should have made headers for the CSVs! Bad formatting. I would like to make it so all you have to do is run a few scripts, for MozFest etc., and you get some basic stats for your etherpads. :) Thanks for the feedback!
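A rough sketch of what such a "basic stats" script could look like, with a header row in the CSV this time. The function names, the column names, and the sample pads are all made up for illustration; this is not the script from the repo.

```python
import csv
import io


def pad_stats(name, text):
    """Return basic statistics for one pad's plain-text content."""
    words = text.split()
    # Count only lines that contain something other than whitespace.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return {"pad": name, "lines": len(lines), "words": len(words), "chars": len(text)}


def write_stats_csv(pads, out):
    """Write one row per pad, preceded by a header row."""
    fieldnames = ["pad", "lines", "words", "chars"]  # hypothetical column names
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for name, text in sorted(pads.items()):
        writer.writerow(pad_stats(name, text))


# Hypothetical sample data: pad name -> exported text.
pads = {"session-a": "Open web notes\nprivacy and literacy\n", "session-b": ""}
buffer = io.StringIO()
write_stats_csv(pads, buffer)
print(buffer.getvalue())
```

A pad with zero words (like `session-b` here) would be a candidate for a "duds" list.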
I did a quick look and found:
Then I dug into top words, filtering out common words (might have been a bit too aggressive using "custom 9300".)
127, mozfest 126, web 72, (internet 6) 115 as http 58
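For anyone who wants to reproduce a top-words count like this, here is a minimal sketch using Python's `collections.Counter`. The tiny `STOPWORDS` set is a stand-in I made up, not the "custom 9300" list mentioned above, and the sample sentence is invented.

```python
import re
from collections import Counter

# Hypothetical, very small stopword list; a real run would use a much larger one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "we", "it"}


def top_words(text, n=5, stopwords=STOPWORDS):
    """Return the n most common words in text, ignoring stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords and len(t) > 1)
    return counts.most_common(n)


sample = "The web is for everyone. We love the web and MozFest. MozFest is the web."
print(top_words(sample, n=2))  # [('web', 3), ('mozfest', 2)]
```

A more aggressive stopword list removes more noise but can also hide words that matter, which is the trade-off noted above.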
I removed names and uninteresting words, then tried to form a bit of a narrative:
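(Word counts for each part of the narrative; the words themselves are missing here and only the counts remain.)
47, 36, 18, 15, 12, 11, 7, 7, 7, 6, 30, 25, 28, 20, 9, 7, 22, 15, 13, 10, 8, 7, 6, 24, 19, 10, 10, 9, 9, 8, 8, 8, 6, 31, 31, 18, 16, 11, 9, 8, 8, 8, 6, 29, 15, 13, 11, 11, 10, 9, 9, 8, 8, 8, 7, 7, 5, 20, 13, 14, 12, 9, 6, 5, 5, 5, 4, 74, 19, 16, 11, 11, 11, 6, 5, 5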
Yay!!!
a huge thanks to you @auremoser for making this possible. 👏 🚀 🌚 🎸
We should do an unfiltered search for issue names and synonyms. Maybe also for words that express confidence, excitement, concern. Here's a site for finding text analysis tools, such as examining sentiment. No idea if these are the best options.
@samanthaburton – it'd be useful to have keywords for the issues. Can you point us to the summary language you had on your nest posters? Thanks!
Again – this is just a demo of what we could do, if we get better at documenting things with tools like Pulse and the Event App.
cc @edrushka @kristinashu @mmmavis
@auremoser this is so COOL! 👍 I especially love the idea of appending /export/txt to each pad's URL! What an awesome hack - I was trying to figure out how the Etherpad API works & how I can acquire an API key... and now /export/txt totally solves the problem!
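In case it helps anyone else, the /export/txt trick can be sketched in a few lines of Python. The helper names and the example pad URL are hypothetical, and this assumes the pad is publicly readable without a login.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def export_url(pad_url):
    """Append /export/txt to a pad URL, dropping any query string or fragment."""
    parts = urlsplit(pad_url)
    path = parts.path.rstrip("/") + "/export/txt"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def fetch_pad_text(pad_url, timeout=10):
    """Download the plain-text export of a pad (makes a network request)."""
    with urlopen(export_url(pad_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical pad address, used only to show the URL rewriting.
print(export_url("https://pad.example.org/p/my-session/"))
# https://pad.example.org/p/my-session/export/txt
```

Only `export_url` is exercised above; `fetch_pad_text` would be called once per pad URL in the list.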
Summary language is in the updated 2020 strategy!
We don't have 'official' keywords yet, but @KevZawacki is going to be working on defining some for each issue to help our monitoring & evaluation efforts - so, depending on the timeline for this analysis, that work could be helpful.
Thanks @xmatthewx for the idea and the narrative breakdown :), and yah @mmmavis !! one of our fellows, richard, suggested that hack to me :)
Thanks @samanthaburton. It's going to be challenging to count keywords with high confidence that they're related to the issues. Insights are tricky!!
We could (1) find keywords, (2) grab surrounding text, then (3) analyze for relevance.
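Those three steps could be prototyped roughly like this. It is only a sketch: the keyword, the related terms, and the overlap score used for step 3 are placeholders I picked for illustration, and a real relevance check would need something better than counting shared words.

```python
import re


def keyword_contexts(text, keyword, window=5):
    """Steps 1 and 2: find each keyword hit and return the words around it."""
    tokens = re.findall(r"[\w']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            hits.append(tokens[max(0, i - window): i + window + 1])
    return hits


def relevance(context, related_terms):
    """Step 3 (crude stand-in): count context words that are in related_terms."""
    return sum(1 for tok in context if tok in related_terms)


# Hypothetical sample text, keyword, and related terms.
text = "We need better privacy tools so people can control their data online."
related = {"data", "control", "encryption"}
for ctx in keyword_contexts(text, "privacy", window=8):
    print(relevance(ctx, related), " ".join(ctx))
```

A larger `window` pulls in more surrounding text, which raises the chance of a match but also of a false positive.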
Keywords directly from issue statements:
@xmatthewx
Got something up just for us to play around: http://allhands-etherpads-analyzer.herokuapp.com/ We'll need to spend effort on data mining to make raw Etherpad data useful though 👀
Word Cloud: http://allhands-etherpads-analyzer.herokuapp.com/wordcloud ☁️
really cool @mmmavis !
Here's some info we could try to extract from quick brainstorm with @gvn.
easy:
harder:
hardest: