nsaef
/
text_exploration
Tool for analyzing large unstructured collections of digital text documents. Master's thesis in Digital Humanities.
3
stars
0
forks
issues
Nsaef cleanup
#77
nsaef
closed
5 years ago
0
Files too big for pickle
#76
nsaef
closed
5 years ago
0
Named Entities occurring together are not saved as one entity
#75
nsaef
opened
6 years ago
0
Version identification doesn't work with big collections
#74
nsaef
opened
6 years ago
0
NER doesn't work on big corpora (java heap out of bounds)
#73
nsaef
closed
6 years ago
1
Clusters/Topic Models: Links don't work
#72
nsaef
opened
6 years ago
0
Determine document language
#71
nsaef
closed
6 years ago
0
Collection: Format analysis
#70
nsaef
closed
6 years ago
0
Vocabulary analysis: check and change the order
#69
nsaef
closed
6 years ago
2
Generate sent-corpus as a by-product of the tokenized corpus?
#68
nsaef
opened
6 years ago
0
Topic Models: Allow custom stop words for the current run/collection
#67
nsaef
opened
6 years ago
0
N-Grams: evaluate methods to find the best one
#66
nsaef
opened
6 years ago
2
Frequencies: implement tf-idf
#65
nsaef
closed
6 years ago
1
Exclude duplicates from processing?
#64
nsaef
opened
6 years ago
0
Celery: show feedback (task running/done + output)
#63
nsaef
opened
6 years ago
0
Word Cloud on collection statistics
#62
nsaef
closed
6 years ago
0
Explanatory texts
#61
nsaef
closed
5 years ago
0
Implement textract on HKI sandbox
#60
nsaef
closed
6 years ago
0
Read additional file formats
#59
nsaef
closed
6 years ago
0
Encoding issues on file upload from server
#58
nsaef
closed
6 years ago
1
Handle locked/encrypted PDFs
#57
nsaef
closed
6 years ago
1
Set up test installation
#56
nsaef
closed
6 years ago
0
Test: Largest-possible file upload
#55
nsaef
closed
6 years ago
2
Test with emails?
#54
nsaef
closed
6 years ago
1
save similar docs and add tf-idf
#53
nsaef
closed
6 years ago
0
"table of contents"/nav links at start of document detail
#52
nsaef
opened
6 years ago
0
similar docs: lower threshold, display in table
#51
nsaef
closed
6 years ago
0
network visualisation of document similarity
#50
nsaef
closed
6 years ago
0
setup script that creates necessary directories etc
#49
nsaef
closed
6 years ago
1
recreate ngrams and NEs
#48
nsaef
closed
6 years ago
0
Similar docs: default sort by score
#47
nsaef
closed
6 years ago
0
Bi-/Trigrams: Filter/update stopwords
#46
nsaef
closed
6 years ago
1
Most Frequent Words: Implement/update stopwords
#45
nsaef
closed
6 years ago
1
Mark duplicates and versions
#44
nsaef
closed
6 years ago
2
Upgrade NLTK
#43
nsaef
closed
6 years ago
1
Convert pdf and doc(x) to txt
#42
nsaef
closed
6 years ago
0
Visualizations: File Tree
#41
nsaef
opened
6 years ago
0
File upload: handle directories
#40
nsaef
closed
6 years ago
0
update django-celery-results
#39
nsaef
closed
6 years ago
0
Update Model: One-To-Many instead of Many-to-Many
#38
nsaef
closed
6 years ago
2
implement bulk_create
#37
nsaef
closed
6 years ago
1
gensim: major upgrade
#36
nsaef
closed
6 years ago
0
Documents: add note field
#35
nsaef
closed
6 years ago
1
Implement background tasks
#34
nsaef
closed
5 years ago
3
run tasks in background?
#33
nsaef
closed
6 years ago
1
Divide Collection into subpages
#32
nsaef
closed
6 years ago
0
Semantic clusters
#31
nsaef
closed
5 years ago
3
Topic Models
#30
nsaef
closed
5 years ago
3
N-Grams
#29
nsaef
closed
6 years ago
3
About-page
#28
nsaef
opened
6 years ago
0