reisepass ETHz_HeadlineGenerator issues

reisepass / ETHz_HeadlineGenerator

NLP 2013 INF.ETHz


issues


ArticleTopicNGramSum and MostProbSentBasedOnTopicDocProb used to have a constructor that did not require a TreeMap corpus. This was based on test data I made inside the class. We must change all these instantiations to also pass the corpus as a T

#21 reisepass opened 11 years ago
0
Out of bounds in FeatureBasedSummary.getTopEntity()

#20 reisepass opened 11 years ago
2
Index out of bounds error in DocNGramSimple on line 59: ngramWords[i] = words[i]; see below

#19 reisepass opened 11 years ago
8
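The failing statement in #19 copies words into an n-gram array. A minimal sketch of bounds-safe n-gram extraction is shown here. The class and method names (`NGramSketch`, `ngrams`) are hypothetical, and the real code around line 59 of DocNGramSimple is not shown in this listing, so this only illustrates the kind of guard that avoids indexing past the end of `words`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the project's DocNGramSimple.
final class NGramSketch {
    // Returns every contiguous n-gram of `words`. When n is non-positive or
    // larger than words.length, returns an empty list instead of throwing.
    static List<List<String>> ngrams(String[] words, int n) {
        List<List<String>> out = new ArrayList<>();
        if (n <= 0 || words.length < n) {
            return out;
        }
        // start + n <= words.length guarantees words[start + i] is in range.
        for (int start = 0; start + n <= words.length; start++) {
            String[] ngramWords = new String[n];
            for (int i = 0; i < n; i++) {
                ngramWords[i] = words[start + i];
            }
            out.add(Arrays.asList(ngramWords));
        }
        return out;
    }
}
```

The loop condition is the important part: a document shorter than n yields no n-grams rather than an exception.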
getTopicDocProb sometimes throws a null pointer exception. I suggest we let it return zero when it can't find the topic-doc probability. See comment

#18 reisepass opened 11 years ago
1
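The suggestion in #18 could look roughly like the sketch below. The storage layout (a map from topic id to a per-document map) and the class name `TopicDocProbSketch` are assumptions made for illustration; only the method name `getTopicDocProb` comes from the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical container; the project's real data structure may differ.
final class TopicDocProbSketch {
    private final Map<Integer, Map<String, Double>> probs = new HashMap<>();

    void put(int topic, String docId, double p) {
        probs.computeIfAbsent(topic, k -> new HashMap<>()).put(docId, p);
    }

    // Returns 0.0 when either the topic or the document is unknown,
    // instead of dereferencing a null map or unboxing a null Double.
    double getTopicDocProb(int topic, String docId) {
        Map<String, Double> byDoc = probs.get(topic);
        if (byDoc == null) {
            return 0.0;
        }
        Double p = byDoc.get(docId);
        return p == null ? 0.0 : p;
    }
}
```

Returning zero hides the difference between "probability is zero" and "pair was never seen", which may or may not matter for the summarizers that consume the value.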
Regex not doing what we want when parsing News200 corp

#17 reisepass opened 11 years ago
0
Change the most likely ngram sentence summary to use TreeMap.lowerEntry(key == WildCard + query) instead of Collections.sort

#16 reisepass opened 11 years ago
0
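One possible reading of #16 is a prefix lookup on a sorted map. The sketch below assumes string keys, natural ordering, and a sentinel "wildcard" character appended to the query (the issue's key is WildCard + query, whose exact layout is not specified here). `LowerEntrySketch`, `WILDCARD`, and `lastWithPrefix` are hypothetical names. `TreeMap.lowerEntry` returns the greatest entry strictly below the given key in O(log n), so no full sort is needed.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of TreeMap.lowerEntry for prefix lookups.
final class LowerEntrySketch {
    // Assumed sentinel: the largest char sorts after any ordinary continuation.
    static final char WILDCARD = Character.MAX_VALUE;

    // Returns the lexicographically last entry whose key starts with `prefix`,
    // or null when no key has that prefix.
    static Map.Entry<String, Double> lastWithPrefix(TreeMap<String, Double> ngrams, String prefix) {
        Map.Entry<String, Double> e = ngrams.lowerEntry(prefix + WILDCARD);
        return (e != null && e.getKey().startsWith(prefix)) ? e : null;
    }
}
```

Note that this picks the last key in key order, not the most probable entry. Getting the most likely n-gram this way would need a comparator that orders by probability within a prefix.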
Change all the Comparator implementations for the ngram TreeMap<ArrayList<String>, Double> stuff so that they ignore upper/lower case, punctuation, and white space

#15 reisepass opened 11 years ago
0
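A comparator along the lines of #15 might be sketched as follows. The class name and the exact normalization rule (strip ASCII punctuation and whitespace, lower-case, drop tokens that become empty) are assumptions; the project may want a different definition of "punctuation".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical comparator for n-gram keys such as ArrayList<String>.
final class NormalizedNGramComparator implements Comparator<List<String>> {
    // Lower-cases each token, removes ASCII punctuation and whitespace,
    // and drops tokens that are empty after cleaning.
    static List<String> normalize(List<String> ngram) {
        List<String> out = new ArrayList<>();
        for (String token : ngram) {
            String cleaned = token.replaceAll("[\\p{Punct}\\s]+", "").toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty()) {
                out.add(cleaned);
            }
        }
        return out;
    }

    @Override
    public int compare(List<String> a, List<String> b) {
        List<String> x = normalize(a);
        List<String> y = normalize(b);
        int n = Math.min(x.size(), y.size());
        for (int i = 0; i < n; i++) {
            int c = x.get(i).compareTo(y.get(i));
            if (c != 0) {
                return c;
            }
        }
        // Equal up to the shorter length: the shorter n-gram sorts first.
        return Integer.compare(x.size(), y.size());
    }
}
```

Passing an instance to `new TreeMap<>(new NormalizedNGramComparator())` makes keys that differ only in case, punctuation, or spacing collapse into one entry, so frequencies for such variants would need to be merged on insert.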
//TODO Append all the ngrams from the original document into the outNgram. Add frequencies if an ngram already exists. Weigh down the ngrams from the query doc because its frequencies are not weighted by that probability of topic to doc which all the ngr

#14 reisepass opened 11 years ago
0
Hmmm, it looks like Doc.annotation is not necessarily set. Maybe we should change it so that doc.getAno() checks whether the annotation is null and, if it is, does all the Stanford NLP stuff

#13 reisepass opened 11 years ago
0
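The lazy getter proposed in #13 can be sketched without touching the Stanford NLP API by hiding the annotation step behind a `Supplier`. `LazyDoc` and the `annotator` parameter are hypothetical stand-ins; only the accessor name `getAno` comes from the issue, and the real annotation pipeline call is deliberately left out.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for Doc; A is whatever annotation type the project uses.
final class LazyDoc<A> {
    private A annotation;                 // stays null until first requested
    private final Supplier<A> annotator;  // assumed wrapper around the NLP pipeline

    LazyDoc(Supplier<A> annotator) {
        this.annotator = annotator;
    }

    // Runs the (possibly expensive) annotation on first access only.
    A getAno() {
        if (annotation == null) {
            annotation = annotator.get();
        }
        return annotation;
    }
}
```

This version is not thread-safe; if documents are annotated from several threads, the getter would need synchronization.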
Import an external corpus similar to our own. Derive a set of categories using (LSI | NMF | LDA), and create a method which does not change the categories but just classifies a new article into one of them.

#12 reisepass opened 11 years ago
0
Create a cross-validation script which runs through all Summarizers and over all their parameters, executing the rouge script each time. Store these values in a sorted list for optimization.

#11 reisepass opened 11 years ago
0
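The sweep described in #11 could be structured as below. This is a plain parameter sweep rather than full cross-validation, it assumes a single numeric parameter per summarizer, and the `scorer` callback is a hypothetical placeholder for whatever invokes the rouge script and parses its score. All names are illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sweep over (summarizer, parameter) pairs.
final class ParameterSweepSketch {
    static final class Result {
        final String summarizer;
        final double param;
        final double score;

        Result(String summarizer, double param, double score) {
            this.summarizer = summarizer;
            this.param = param;
            this.score = score;
        }
    }

    // Scores every combination and returns the results sorted best-first.
    static List<Result> sweep(List<String> summarizers, double[] params,
                              ToDoubleBiFunction<String, Double> scorer) {
        List<Result> results = new ArrayList<>();
        for (String s : summarizers) {
            for (double p : params) {
                results.add(new Result(s, p, scorer.applyAsDouble(s, p)));
            }
        }
        results.sort(Comparator.comparingDouble((Result r) -> r.score).reversed());
        return results;
    }
}
```

The first element of the returned list is then the best-scoring configuration under the supplied scorer.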
In summary method 2 (NeFreq and NounFreq), include information about how often two NEs, or an NE and a noun, occur together in a sentence.

#10 reisepass opened 11 years ago
0
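The co-occurrence information asked for in #10 amounts to counting unordered pairs of terms per sentence. A minimal sketch, assuming each sentence has already been reduced to the set of its NEs and nouns (that extraction step is not shown) and using a hypothetical `"x|y"` string as the pair key:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical pair counter; term extraction is assumed to happen elsewhere.
final class CooccurrenceSketch {
    // Counts, for every unordered pair of terms, the number of sentences
    // in which both terms appear. Keys have the form "first|second" with
    // the two terms in natural String order.
    static Map<String, Integer> countPairs(List<Set<String>> sentences) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> terms : sentences) {
            // Sorting makes (a, b) and (b, a) map to the same key.
            List<String> sorted = new ArrayList<>(new TreeSet<>(terms));
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    counts.merge(sorted.get(i) + "|" + sorted.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

The pair loop is quadratic in the number of terms per sentence, which should be acceptable for sentence-sized inputs.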
Create a new summarizer which is like the NeFreq one but also considers non-NEs in the list of most used words.

#9 reisepass opened 11 years ago
0
Adjust the NE count functions to include prepositions which refer to the NE. Use this: http://nlp.stanford.edu/software/dcoref.shtml

#8 reisepass opened 11 years ago
0
Extractive summarizer based on NE frequency

#7 reisepass closed 11 years ago
0
Implement Significant phrase annotation

#6 reisepass closed 11 years ago
0
Configuration

#5 jarednieder closed 11 years ago
1
Clauses

#4 jarednieder opened 11 years ago
0
Sentence Trimming

#3 jarednieder opened 11 years ago
1
Implement Summarizer : Naive, First Sentence

#2 reisepass opened 11 years ago
0
Implement Summarizer : Naive, First Sentence

#1 reisepass opened 11 years ago
0
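The naive first-sentence summarizer from #1 and #2 can be sketched with the JDK's `java.text.BreakIterator`. This is an assumption for illustration only: the project uses Stanford NLP elsewhere and may split sentences with that instead. The class and method names are hypothetical.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical baseline: the headline is simply the article's first sentence.
final class FirstSentenceSketch {
    static String firstSentence(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        int end = it.next();
        // No boundary found (for example, empty input): return the trimmed text.
        if (end == BreakIterator.DONE) {
            return text.trim();
        }
        return text.substring(start, end).trim();
    }
}
```

A baseline like this is mainly useful as a reference point when comparing rouge scores of the other summarizers.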