Data4Democracy internal-displacement issues

Data4Democracy / internal-displacement

Studying news events and internal displacement.

43 stars, 27 forks

issues

scraper

#54 jlln closed 7 years ago
0
simple enhancements to the country code extraction function

#53 simonb83 closed 7 years ago
0
initial unit tests for Interpreter and Scraper

#52 simonb83 closed 7 years ago
1
Enhance country detection in article content

#51 simonb83 closed 7 years ago
6
Fix exception handling in pdf_parsing

#50 simonb83 closed 7 years ago
0
sql to csv function

#49 georgerichardson closed 7 years ago
0
Gr/debug1

#48 georgerichardson closed 7 years ago
0
Issue 41/pdf published date

#47 simonb83 closed 7 years ago
0
Pdfs/43

#46 georgerichardson closed 7 years ago
0
Add functions to test if url is pdf by looking at url...

#45 simonb83 closed 7 years ago
0
Extract country code

#44 simonb83 closed 7 years ago
1
Manage PDF scraping

#43 georgerichardson closed 7 years ago
8
Detect URLs with PDF

#42 georgerichardson closed 7 years ago
2
Extract document details from PDF

#41 georgerichardson opened 7 years ago
10
Filter enhancements

#40 simonb83 closed 7 years ago
0
Incorporate Scrape pdf/17

#39 georgerichardson closed 7 years ago
0
experimental notebook for spacy parsing of article titles

#38 wwymak closed 7 years ago
0
further nlp exploration for methods to identify relevant articles

#37 simonb83 closed 7 years ago
0
implement Filter with method for setting language property of articles

#36 simonb83 closed 7 years ago
1
move files from internal-d to internal_d and delete old directory

#35 simonb83 closed 7 years ago
0
Jlln pipeline

#34 jlln closed 7 years ago
1
refactor scraper.py in internal-displacement

#33 simonb83 closed 7 years ago
1
some initial nlp exploration with spacy

#32 simonb83 closed 7 years ago
0
Scraper - Detect and tag language

#31 georgerichardson closed 7 years ago
4
appropriately tag broken urls that cannot be downloaded by newspaper

#30 simonb83 closed 7 years ago
0
Revert "deal with case where url doesn't exist"

#29 georgerichardson closed 7 years ago
0
SQL Interface

#28 jlln closed 7 years ago
0
Adding pdf parsing

#27 coldfashioned closed 7 years ago
2
Scraper - Tag content type

#26 georgerichardson opened 7 years ago
4
Scraper - Asynchronous tasks for scraper.py

#25 georgerichardson closed 7 years ago
1
Scraper - Refactor old scraper

#24 georgerichardson closed 7 years ago
0
Merge pull request #1 from Data4Democracy/scraper

#23 jlln closed 7 years ago
1
deal with case where url doesn't exist

#22 simonb83 closed 7 years ago
4
add Data Engineering section to workplan

#21 simonb83 closed 7 years ago
0
Pipeline - consistent date and time

#20 georgerichardson closed 7 years ago
6
Pipeline - save data to csv

#19 georgerichardson closed 7 years ago
5
Explore refugee data in Jupyter Notebooks

#18 georgerichardson closed 7 years ago
4
Scraper - Parsing PDFs?

#17 georgerichardson closed 7 years ago
9
Create, maintain and update user guide / admin guide.

#16 simonb83 closed 7 years ago
0
Build / train classifier for article classification

#15 simonb83 closed 7 years ago
2
Best Machine Learning approach for classifying documents and articles

#14 simonb83 closed 7 years ago
13
Implement filtering of documents not reporting on human mobility

#13 simonb83 closed 7 years ago
0
Update workplan.md with additional detail from Leonardo Milano + sect…

#12 simonb83 closed 7 years ago
0
Issues structure

#11 simonb83 closed 7 years ago
0
Train classifier on training dataset - Utilities for training classifiers

#10 jlln closed 7 years ago
2
Fill out sample_urls function for returning a subsample of urls.

#9 simonb83 closed 7 years ago
0
Get random subsample of URLs from list

#8 georgerichardson closed 7 years ago
5
Train classifier on training dataset

#7 georgerichardson closed 7 years ago
3
Visualization discussion

#6 georgerichardson closed 7 years ago
6
Improve text extraction from URLs with beautifulsoup

#5 georgerichardson closed 7 years ago
13
