Hello,
I suspect this is due to the Danish characters. Unfortunately, I don't really have free time at the moment; I hope someone else can fix this one.
Regards,
Vlad
Original comment by halle...@gmail.com
on 12 Jan 2010 at 8:58
SBenjaminP, is this still an issue? If it is, I'll try to index the Danish Wikipedia myself and see what happens.
Original comment by asaf.bartov
on 7 Apr 2010 at 6:45
Okay, the issue reproduces on Windows XP too.
It's a problem in Snowball.NET, the stemmer used by Lucene.NET.
Here's the exception, for the record:
System.SystemException was unhandled
Message="System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.ArgumentOutOfRangeException: Index and length must refer to a location within the string.\r\nParameter name: length\r\n
at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)\r\n
at System.Text.StringBuilder.ToString(Int32 startIndex, Int32 length)\r\n
at SF.Snowball.SnowballProgram.slice_to(StringBuilder s) in C:\\Asaf\\wikimedia\\bzreader\\Snowball.NET\\SF\\Snowball\\SnowballProgram.cs:line 466\r\n
at SF.Snowball.Ext.DanishStemmer.r_undouble() in C:\\Asaf\\wikimedia\\bzreader\\Snowball.NET\\SF\\Snowball\\Ext\\DanishStemmer.cs:line 353\r\n
at SF.Snowball.Ext.DanishStemmer.Stem() in C:\\Asaf\\wikimedia\\bzreader\\Snowball.NET\\SF\\Snowball\\Ext\\DanishStemmer.cs:line 441\r\n
--- End of inner exception stack trace ---\r\n
at System.RuntimeMethodHandle._InvokeMethodFast(Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)\r\n
at System.RuntimeMethodHandle.InvokeMethodFast(Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)\r\n
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)\r\n
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)\r\n
at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters)\r\n
at Lucene.Net.Analysis.Snowball.SnowballFilter.Next() in C:\\Asaf\\wikimedia\\bzreader\\Snowball.NET\\Lucene.Net\\Analysis\\Snowball\\SnowballFilter.cs:line 72"
Source="Snowball.Net"
StackTrace:
at Lucene.Net.Analysis.Snowball.SnowballFilter.Next() in C:\Asaf\wikimedia\bzreader\Snowball.NET\Lucene.Net\Analysis\Snowball\SnowballFilter.cs:line 76
at Lucene.Net.Index.DocumentWriter.InvertDocument(Document doc) in C:\Asaf\wikimedia\bzreader\Lucene.Net\Index\DocumentWriter.cs:line 181
at Lucene.Net.Index.DocumentWriter.AddDocument(String segment, Document doc) in C:\Asaf\wikimedia\bzreader\Lucene.Net\Index\DocumentWriter.cs:line 106
at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer) in C:\Asaf\wikimedia\bzreader\Lucene.Net\Index\IndexWriter.cs:line 616
at Lucene.Net.Index.IndexWriter.AddDocument(Document doc) in C:\Asaf\wikimedia\bzreader\Lucene.Net\Index\IndexWriter.cs:line 603
at BzReader.Indexer.TokenizeAndAdd(Object state) in C:\Asaf\wikimedia\bzreader\BzReader\Indexer.cs:line 584
at System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(_ThreadPoolWaitCallback tpWaitCallBack)
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)
InnerException:
Some quick thoughts:
1. We should upgrade the bundled Lucene.NET and Snowball.NET. There's a ticket already open for this, assigned to me. I'll try to find time to make progress with this.
2. We should tolerate any kind of exception during stemming and indexing, so that BzReader itself doesn't crash, even when indexing fails completely.
I'll be looking into it later this week.
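To illustrate point 2, here is a minimal sketch of what "tolerant" could look like at the stemming step. It is not BzReader's or Lucene.NET's actual code: `SafeStemming`, `StemOrOriginal` and `FaultyStem` are hypothetical names, and the stemmer is passed in as a plain delegate so the example stands alone. `FaultyStem` only imitates the failure in the trace above by asking `StringBuilder.ToString(Int32, Int32)` for a range past the end of the buffer.

```csharp
using System;
using System.Text;

public static class SafeStemming
{
    // Hypothetical helper: runs any stemming function and falls back to the
    // unstemmed term if the stemmer throws, so one bad word cannot abort
    // the whole indexing run.
    public static string StemOrOriginal(Func<string, string> stem, string term)
    {
        try
        {
            return stem(term);
        }
        catch (Exception)
        {
            // Assumption: indexing the raw term is an acceptable fallback.
            // The term is still searchable, just not in stemmed form.
            return term;
        }
    }

    // Stand-in for a buggy stemmer (not the real DanishStemmer): requests one
    // character more than the buffer holds, which makes
    // StringBuilder.ToString(int, int) throw ArgumentOutOfRangeException.
    public static string FaultyStem(string term)
    {
        StringBuilder buffer = new StringBuilder(term);
        return buffer.ToString(0, term.Length + 1);
    }
}
```

With this in place, `SafeStemming.StemOrOriginal(SafeStemming.FaultyStem, "kaffe")` hands back the input term instead of throwing. Catching `Exception` broadly is a deliberate choice for this sketch; in practice it would probably be worth logging the failing term so the stemmer bug can still be tracked down.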
Original comment by asaf.bartov
on 7 Apr 2010 at 8:51
Looking forward to hearing any news. :-)
Original comment by SBenjam...@gmail.com
on 13 Apr 2010 at 5:04
See ticket #10 for upgrading Lucene.Net. The Snowball.Net version shipping with BzReader is already the latest one available for .NET.
Original comment by itamar.s...@gmail.com
on 18 Jul 2010 at 10:16
Original issue reported on code.google.com by
SBenjam...@gmail.com
on 11 Jan 2010 at 4:55