karlhigley/lexrank-summarizer
A Spark-based LexRank extractive summarizer for text documents
MIT License
19 stars, 4 forks
issues
Use accumulators to quantify boilerplate removal
#44
karlhigley
opened
8 years ago
0
Use GraphX .reverse method to generate bidirectional edges
#43
karlhigley
opened
8 years ago
0
Use Spark's DataFrames API
#42
karlhigley
opened
8 years ago
1
Maintain the order of excerpted sentences
#41
karlhigley
opened
8 years ago
0
Add a link to the LSH paper that explains the pooling trick
#40
karlhigley
closed
8 years ago
0
Update the README to reflect dynamic stopword filtering
#39
karlhigley
closed
8 years ago
0
Remove obsolete option for number of LSH buckets
#38
karlhigley
closed
8 years ago
0
Switch from a stopword list to dynamically identified stopwords
#37
karlhigley
closed
8 years ago
0
Replace explicit removal of zeros with conversion to SparseVector
#36
karlhigley
closed
8 years ago
0
Remove extraneous launch script
#35
karlhigley
closed
8 years ago
0
Stop combining documents with the same identifier during pre-processing
#34
karlhigley
closed
8 years ago
0
Revise description of SRP-LSH boilerplate filtering
#33
karlhigley
closed
8 years ago
0
Add LSH cosine estimation method of graph building to LexRank
#32
karlhigley
closed
8 years ago
0
Compute stopwords from the corpus on the fly
#31
karlhigley
closed
8 years ago
0
Represent similarities as Floats instead of Doubles in LexRank
#30
karlhigley
closed
8 years ago
0
Update to Spark 1.5.0
#29
karlhigley
closed
8 years ago
0
Consolidate input document content by doc ID
#28
karlhigley
closed
8 years ago
0
Rename CosineLSH to SignRandomProjectionLSH
#27
karlhigley
closed
9 years ago
0
Update to Spark 1.4.1
#26
karlhigley
closed
9 years ago
0
Add a configuration option for the number of LSH buckets
#25
karlhigley
closed
9 years ago
0
Cache (LSH signature, feature vector) pairs
#24
karlhigley
closed
9 years ago
0
Disentangle/test similarity computation and LexRank model
#23
karlhigley
closed
9 years ago
0
Test and refactor featurization code
#22
karlhigley
closed
9 years ago
0
Add basic tests for CosineLSH model
#21
karlhigley
closed
9 years ago
0
Precompute size of sparsified matrices (instead of auto-computation)
#20
karlhigley
closed
9 years ago
0
Improve tokenization to reduce dimensionality
#19
karlhigley
closed
9 years ago
0
Fix broken variable reference from repartitioning code
#18
karlhigley
closed
9 years ago
0
Use minPartitions argument when reading file instead of repartitioning
#17
karlhigley
closed
9 years ago
0
Revert "Combine input entries with the same identifier into a single …
#16
karlhigley
closed
9 years ago
0
Properly parse input lines with text containing tabs
#15
karlhigley
closed
9 years ago
0
Combine input entries with the same identifier into a single document
#14
karlhigley
closed
9 years ago
0
Apply Kryo serialization
#13
karlhigley
closed
9 years ago
0
Adjust spacing and driver class name in README
#12
karlhigley
closed
9 years ago
0
Remove extraneous imports from LexRank model
#11
karlhigley
closed
9 years ago
0
Uses locality-sensitive hashing to group similar sentences into buckets
#10
karlhigley
closed
9 years ago
0
Move featurization to separate class and excerpt selection to Driver
#9
karlhigley
closed
9 years ago
0
Amend script to allow 2g memory for driver and executors
#8
karlhigley
closed
9 years ago
0
Straighten out package name, split into separate files
#7
karlhigley
closed
9 years ago
0
Add corpus-level boilerplate filtering
#6
karlhigley
closed
9 years ago
0
Add a link to the LexRank paper in the README
#5
karlhigley
closed
9 years ago
0
Avoid creating graph edges between different documents
#4
karlhigley
closed
9 years ago
0
Extract featurization into a companion object utility function
#3
karlhigley
closed
9 years ago
0
Fix a typo in the previous edge filtering refactor
#2
karlhigley
closed
9 years ago
0
Pre-filter the graph edges rather than filtering the graph itself
#1
karlhigley
closed
9 years ago
0