karensg / crowd-summary

Crowd Summary Tool

When uploading a document, the text is not saved in the full text field in the documents table #25

Closed timsweep closed 10 years ago

fabcouwer commented 10 years ago

Currently I also get the notification "Automatic summarization failed." when uploading a document in the master branch. @MBrouns, do you have an idea what is causing this?

MBrouns commented 10 years ago

Is the fulltext field filled in the database? Is the dbPath that the jar is called with set correctly to match the location on your PC?

fabcouwer commented 10 years ago

No, I meant this is the same issue (the fulltext doesn't get put into the DB).

bouke-nederstigt commented 10 years ago

@MBrouns @yetti4 Full text should be neatly put in the db. It's just not working for some files. I think it might have something to do with validation, but I can't really figure out how exactly. Do you guys have any ideas as to what could be causing this?

fabcouwer commented 10 years ago

Maybe something to do with UTF/ANSI encoding?

bouke-nederstigt commented 10 years ago

Don't think so. It's no problem getting the file contents and/or writing it. It's just a normal string with any file. Some texts just disappear when trying to save the model

MBrouns commented 10 years ago

Is there a max size to a text field in sqlite?

bouke-nederstigt commented 10 years ago

Nope. Tested with much larger files that were OK. I think it has to do with characters in the document, like lists etc. I just need to figure out which ones or how to filter them out.
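The size question can be checked in isolation. The following is a minimal sketch, assuming an in-memory SQLite database and a `documents` table with a `fulltext` TEXT column as named in this thread; the helper name `roundtrip_length` is made up for illustration. It stores a string of several million characters and reads its length back, which is well under SQLite's default maximum string length of 1,000,000,000 bytes.

```python
import sqlite3


def roundtrip_length(text: str) -> int:
    """Store text in an in-memory SQLite TEXT column and return the stored length."""
    conn = sqlite3.connect(":memory:")
    try:
        # Table and column names follow the thread; the schema itself is an assumption.
        conn.execute('CREATE TABLE documents (id INTEGER PRIMARY KEY, "fulltext" TEXT)')
        conn.execute('INSERT INTO documents ("fulltext") VALUES (?)', (text,))
        (stored,) = conn.execute('SELECT length("fulltext") FROM documents').fetchone()
        return stored
    finally:
        conn.close()


big = "lorem ipsum " * 500_000  # 6,000,000 characters
print(roundtrip_length(big) == len(big))
```

If the lengths match for a large input, size alone is unlikely to explain why some documents end up with an empty field.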

bouke-nederstigt commented 10 years ago

So the fulltext is still not saved sometimes, but for some reason the summary can be generated? I thought this wasn't possible

MBrouns commented 10 years ago

Shouldn't be possible. My summarizer gets the text to summarize from the database. It can still summarize when another document's fulltext is empty, though.

bouke-nederstigt commented 10 years ago

@MBrouns @yetti4 The problem seems to be solved by using a MySQL db. Fulltext is now saved in a BLOB column. This does, however, seem to break the summarization tool.

(int) 0 => 'Database connection established',
(int) 1 => 'java.lang.NullPointerException: null'

MBrouns commented 10 years ago

Makes sense, I use the sqlite connector. I'll make a new jar for the production environment. Can't test it myself though so you'll have to do that

MBrouns commented 10 years ago

Try the new version. Pass the extra command-line argument "mysql" at the end and use dbpath in the following way:

jdbc:mysql://localhost/database?"

MBrouns commented 10 years ago

sorry I made a mistake. try the new version again

bouke-nederstigt commented 10 years ago

Not sure I've got the command right:

java -jar C:\websites\crowd-summary\app../summarizers/Summarizer.jar 48 jdbc:mysql://localhost/database?" + "user=root&password=root mysql 2>&1?

bouke-nederstigt commented 10 years ago

(int) 0 => ''password' is not recognized as an internal or external command,', (int) 1 => 'operable program or batch file.'

MBrouns commented 10 years ago

Try:

java -jar C:\websites\crowd-summary\app../summarizers/Summarizer.jar 48 "jdbc:mysql://localhost/database?user=root&password=root" mysql
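The quotes matter because the Windows command interpreter treats an unquoted `&` as a command separator, which is why `password=root` was run as a command of its own in the earlier attempt. One way to sidestep shell quoting altogether is to pass the arguments as a list. The sketch below is in Python purely for illustration (the web app in this thread builds the command elsewhere); the function names and the placeholder jar path are assumptions, while the argument order follows the command above.

```python
import subprocess


def build_summarizer_command(jar_path, document_id, jdbc_url, db_type):
    """Return the argument list: java -jar <jar> <document id> <JDBC URL> <db type>."""
    return ["java", "-jar", jar_path, str(document_id), jdbc_url, db_type]


def run_summarizer(cmd):
    """Run the command without a shell, so '&' in the URL is never interpreted."""
    return subprocess.run(cmd, capture_output=True, text=True)


cmd = build_summarizer_command(
    "Summarizer.jar",  # placeholder path, not the real location
    48,
    "jdbc:mysql://localhost/database?user=root&password=root",
    "mysql",
)
print(cmd)
```

`run_summarizer(cmd)` is left uncalled here because it needs Java and the jar to be present.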

bouke-nederstigt commented 10 years ago

java.lang.ClassNotFoundException: com.mysql.jdbc.Driver

MBrouns commented 10 years ago

ah ye forgot to add the jar. Try again

bouke-nederstigt commented 10 years ago

Getting closer

(int) 1 => 'com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[document_id] FROM sentences WHERE document_id = 52' at line 1'

MBrouns commented 10 years ago

Try again. Also, add the files you changed to make it work on MySQL to your gitignore, so it keeps working for us.

bouke-nederstigt commented 10 years ago

Thought I put that in my gitignore already. Almost there.

(int) 1 => 'com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'fulltext FROM documents WHERE id = 53' at line 1'

Could you change fulltext to full_text? fulltext is a reserved keyword in MySQL.

MBrouns commented 10 years ago

In the new version I put backticks around fulltext if it uses mysql so you shouldn't need to change anything in your db
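The two syntax errors above come down to identifier quoting: SQLite accepts `[name]`, while MySQL expects backticks, and `fulltext` has to be quoted in MySQL because it is reserved. A minimal sketch of dialect-aware quoting follows; the helper names are hypothetical and this is not the code inside the jar.

```python
def quote_identifier(name: str, dialect: str) -> str:
    """Quote a column or table name for the given SQL dialect."""
    if dialect == "mysql":
        # MySQL quotes identifiers with backticks; a literal backtick is doubled.
        return "`" + name.replace("`", "``") + "`"
    if dialect == "sqlite":
        # SQLite accepts standard double-quoted identifiers.
        return '"' + name.replace('"', '""') + '"'
    raise ValueError(f"unsupported dialect: {dialect}")


def select_fulltext_sql(dialect: str) -> str:
    """Build the query for a document's text, with the column name quoted."""
    column = quote_identifier("fulltext", dialect)
    return f"SELECT {column} FROM documents WHERE id = ?"


print(select_fulltext_sql("mysql"))
print(select_fulltext_sql("sqlite"))
```

Quoting in the query builder keeps the existing column name, so neither database schema has to change.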

bouke-nederstigt commented 10 years ago

Next error

array(
(int) 0 => 'Database connection established',
(int) 1 => 'Apr 01, 2014 2:25:45 PM edu.stanford.nlp.process.PTBLexer next',
(int) 2 => 'WARNING: Untokenizable: ? (U+FFFD, decimal: 65533)',
(int) 3 => 'noOfLines: 15',
(int) 4 => 'Database insertion complete',
(int) 5 => 'Start generating keywords for document',
(int) 6 => 'java.lang.NullPointerException: null'
)

MBrouns commented 10 years ago

What happens if you use a LONGTEXT column instead of BLOB in MySQL?

bouke-nederstigt commented 10 years ago

It seems we're finally getting closer to the source of all these troubles. I could only change the column after emptying the table. But at least I now get SQL errors from cake and some signs to go on. I'll see if I can figure out how to sanitize the data. Let me know if you have any ideas.

SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\x93\x09But ...' for column 'fulltext' at row 1

It seems to have to do with encoding of the file after all.
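The bytes in that error message are a useful clue. `\x93` on its own is not valid UTF-8, but in Windows-1252 it is the left curly double quote, and `\x09` is a tab. The sketch below checks both readings; that the uploaded file was actually saved as Windows-1252 is an assumption, not something confirmed in this thread.

```python
raw = b"\x93\tBut"  # the leading bytes quoted in the MySQL error above


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(raw))           # the bytes are rejected as UTF-8
print(repr(raw.decode("cp1252")))   # the same bytes read as Windows-1252
```

If the source encoding is known, decoding with it and re-encoding as UTF-8 keeps characters such as curly quotes instead of losing them.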

bouke-nederstigt commented 10 years ago

By the way, the summarizer now gives the following error:

(int) 0 => 'Database connection established',
(int) 1 => 'noOfLines: 15',
(int) 2 => 'Database insertion complete',
(int) 3 => 'Start generating keywords for document',
(int) 4 => '0.003602441685296153 => inform',
(int) 5 => '0.003533163960578919 => retriev',
(int) 6 => '0.0027018312639721146 => document',
(int) 7 => '0.00187049856736531 => precis',
(int) 8 => '0.0011084435954757392 => system',
(int) 9 => '9.99340523088544E-4 => retrieval&#34;',
(int) 10 => 'Keywords stored in database',
(int) 11 => 'Database connection in classifier established',
(int) 12 => 'training data created',
(int) 13 => 'Iter 1 []<> 0.000E0 (delta: 0.000E0)',
(int) 14 => 'java.lang.ArrayIndexOutOfBoundsException: 1'

bouke-nederstigt commented 10 years ago

@MBrouns I think I got conversion of documents figured out now. The text is correctly added to the db, and also to elasticsearch. I am however still getting the above error (for the weird documents, so it's probably still got to do with that).

Maybe you can investigate what's causing this error, because I can't find any anomalies in the database text anymore. It's currently UTF-8. Anything that can't be converted is just ignored.
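"Anything that can't be converted is just ignored" can be illustrated as follows. This is a sketch of the general idea in Python, not the conversion code used in the app; the function name is made up.

```python
def to_utf8_ignoring_errors(data: bytes) -> str:
    """Decode bytes as UTF-8 and silently drop any byte sequence that is invalid."""
    return data.decode("utf-8", errors="ignore")


cleaned = to_utf8_ignoring_errors(b"\x93\tBut it works")
print(repr(cleaned))
```

The trade-off is that dropped bytes are lost for good: an opening quote stored as `\x93` simply disappears from the text, which may shift what a tokenizer sees at sentence boundaries.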

bouke-nederstigt commented 10 years ago

Different file (that worked before). Seems to be caused by the fact there were no proper sentences in there. Inputting a "." seemed to solve the problem (and recreate the previous one)

array(
(int) 0 => 'Database connection established',
(int) 1 => 'noOfLines: 1',
(int) 2 => 'Database insertion complete',
(int) 3 => 'Start generating keywords for document',
(int) 4 => '0.14527779480227473 => ametlorem',
(int) 5 => 'Keywords stored in database',
(int) 6 => 'Database connection in classifier established',
(int) 7 => 'training data created',
(int) 8 => 'Iter 1 []<> 0.000E0 (delta: 0.000E0)',
(int) 9 => 'java.lang.ArithmeticException: / by zero',
(int) 10 => ' at main.ClassifierSentence.(ClassifierSentence.java:65)',
(int) 11 => ' at main.Summarizer.main(Summarizer.java:221)',
(int) 12 => ' at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)',
(int) 13 => ' at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)',
(int) 14 => ' at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)',
(int) 15 => ' at java.lang.reflect.Method.invoke(Unknown Source)',
(int) 16 => ' at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)',
(int) 17 => 'java.lang.ArrayIndexOutOfBoundsException: 1'
)

MBrouns commented 10 years ago

Wait, I might know what the problem is. Are there already personal summaries (entries in users_sentences where user_id != 0)?

bouke-nederstigt commented 10 years ago

No. I emptied the db

MBrouns commented 10 years ago

Then the classifier has no data to train on and will probably fail. Can you add a document manually or reimport a document from the old db?

bouke-nederstigt commented 10 years ago

Just the users_sentences table?

MBrouns commented 10 years ago

The algorithm needs a document that is split in sentences and some of these sentences should be selected in personal summaries (users_sentences)
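A cheap guard is to check for training data before starting the classifier, so an empty database produces a clear message rather than an exception. The sketch below uses an in-memory SQLite database; the `users_sentences` table and the `user_id != 0` condition come from this thread, while the `sentence_id` column and the function names are assumptions.

```python
import sqlite3


def count_training_sentences(conn) -> int:
    """Count sentences picked in personal summaries (rows with user_id != 0)."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM users_sentences WHERE user_id != 0"
    ).fetchone()
    return count


def can_train(conn) -> bool:
    """True when at least one personal-summary sentence exists to train on."""
    return count_training_sentences(conn) > 0


conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real table may have more columns.
conn.execute("CREATE TABLE users_sentences (user_id INTEGER, sentence_id INTEGER)")
print(can_train(conn))  # nothing to train on after emptying the db

conn.execute("INSERT INTO users_sentences VALUES (7, 16510)")
print(can_train(conn))  # one selected sentence is now available
```

Whether a single selected sentence is enough for the classifier is a separate question; this only rules out the empty case.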

MBrouns commented 10 years ago

If you want, and if you have phpMyAdmin publicly available, I could set it up as well.

bouke-nederstigt commented 10 years ago

I haven't got it publicly available. IP-based vhosts need multiple instances of Apache. Anyway, it seems you were right. Next error:

array(
(int) 0 => 'Database connection established',
(int) 1 => 'noOfLines: 15',
(int) 2 => 'Database insertion complete',
(int) 3 => 'Start generating keywords for document',
(int) 4 => '0.010981828582962786 => retriev',
(int) 5 => '0.005813909249803828 => precis',
(int) 6 => '0.002583959666579479 => vector',
(int) 7 => '0.0019379697499346095 => object',
(int) 8 => '0.0017226397777196528 => query.',
(int) 9 => '0.0015073098055046962 => result',
(int) 10 => 'Keywords stored in database',
(int) 11 => 'Database connection in classifier established',
(int) 12 => 'Added relevant sentence to training: ClassifierSentence [sentenceID=16510, content=It's also Google's approach to the endeavor--its willingness to let third-party developers deeper into the stack and, potentially, to let users define the experience for themselves--that could help make it a hit., length=32, posInDocument=13, keywordSimilarity=0.0hasNote=false]',
(int) 13 => 'Added relevant sentence to training: ClassifierSentence [sentenceID=16529, content=',
(int) 14 => '',
(int) 15 => ' This is where a lightweight user interface is key, and it seems like Google's got a promising foundation, mixing concise, swipe-able cards with optional voice commands., length=27, posInDocument=49, keywordSimilarity=0.0hasNote=false]',
(int) 16 => 'Added relevant sentence to training: ClassifierSentence [sentenceID=16534, content=These could include urgent notifications, like text messages, that buzz your wrist when they come in, or morsels of data that get silently added to your stack, like scores of sports games., length=32, posInDocument=58, keywordSimilarity=0.0hasNote=false]',
(int) 17 => 'training data created',
(int) 18 => 'Iter 1 []<> 0.000E0 (delta: 0.000E0)',
(int) 19 => 'java.lang.ArrayIndexOutOfBoundsException: 1'
)

MBrouns commented 10 years ago

That still seems to go wrong while generating the classifier. Maybe more sentences in users_sentences are needed?

bouke-nederstigt commented 10 years ago

BOOM! Finally closed this one. Summarization seems to be working and indexing as well. We can start adding training data now.