-
# Overview
This issue contains preliminary notes on summer UROP work. It condenses many of the collected notes and issues created as the last few groups have picked up work on this project. The goal …
-
### Preliminary Remark
The observations presented here are also relevant for the _polmineR_ repository.
### Some Background
The _Bundestag Protokolle_ often employ spacing to enhance readability …
-
Note: this issue is reserved for the excellent @jlas :)
**User Story**
**As a** PM managing [ETHPrize](http://ethprize.io/) bounties
**I want to** perform data analysis on the current 82 (and c…
-
Post questions here for this week's fundamental readings: Cornell Conversational Analysis Toolkit (ConvoKit) Documentation: Introductory Tutorial; Core Concepts; the Corpus Model.
OpenAI. 2019. “Be…
-
The scope of the task is to detect patterns within each class of the dataset, with a view to the explainability of the models that will be developed later. Approaches:
- class imbalance probl…
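
A minimal sketch of how the class-imbalance point could be checked before any modelling, assuming the dataset exposes one label per example. The function name `class_balance` and the toy labels are hypothetical and not part of this project; the weight is the common inverse-frequency heuristic, not necessarily the approach that will be adopted here.

```python
from collections import Counter


def class_balance(labels):
    """Return per-class count, share of the dataset, and an inverse-frequency weight."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    report = {}
    for cls, n in counts.items():
        report[cls] = {
            "count": n,
            "share": n / total,
            # Inverse-frequency heuristic: rarer classes get larger weights.
            "weight": total / (n_classes * n),
        }
    return report


# Hypothetical labels, for illustration only.
labels = ["a", "a", "a", "b"]
report = class_balance(labels)
for cls, stats in sorted(report.items()):
    print(cls, stats)
```

A share far from `1 / n_classes` for any class suggests that resampling or class weighting may be worth trying before looking for per-class patterns.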
-
1. Lemmatize the ukwac+wacky corpus using the Jobimify tool:
```
frink:/home/panchenko/jobimify
```
- use the concatenation of these corpora http://cental.fltr.ucl.ac.be/team/~panchenko/d…
-
Stream metadata includes the field `Metadata.composer`, which is very relevant for corpus analysis of classical music. But when it comes to the analysis of popular music, it might sometimes be more relevan…
-
```
Enter a search string
where can i get good pho
Content [I heard the Tofu house is a good restaurant], tfidf [3.135494]
Content [Checkout the Pho restaurant on Broadway], tfidf [3.135494]
Enter a s…
-
NEWSPAPERS
- [ ] Newspapers page: Unclear: which parts are missing within the span of a newspaper
- [ ] No legend: how was this calculated, what are we missing? For @mduering to add to FAQ entries…
-
@sirmarcis how is this different than #30?