I've failed to reproduce this - too many problems with classpaths on my Windows
box :(
Is it actively hampering development, or is it mainly good to know that this
happens?
Original comment by dwidd...@gmail.com
on 14 Oct 2013 at 6:20
I think the issue is quite crucial, because we obtain different vectors
just by changing the document indexing order.
You can also reproduce the problem on a single OS.
Look for the org.apache.lucene.demo.IndexFiles class, then for its
indexDocs(IndexWriter writer, File file) method.
In this method, just add something like this to force a certain indexing order:
if (file.isDirectory()) {
    String[] files = file.list();
    // to index the document in a natural order
    Arrays.sort(files);
    ...
}
Then run it on your Linux OS and you will obtain different vector.bin
files for the same corpus.
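For anyone who wants to try the ordering step in isolation, here is a minimal standalone sketch. It is not the Lucene demo code: the class name SortedListing and the helper sortedNames are made up for illustration. It relies only on the documented behaviour that File.list() makes no guarantee about the order of the returned names, which is why the listing order can differ between platforms unless it is sorted explicitly.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper class, not part of Lucene or its demo package.
class SortedListing {

    // Returns the entry names of dir in natural String order.
    // File.list() gives no ordering guarantee, so we sort explicitly.
    // Returns an empty array if dir is not a readable directory.
    static String[] sortedNames(File dir) {
        String[] names = dir.list();
        if (names == null) {
            return new String[0];
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Directory to list: first argument, or the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : sortedNames(dir)) {
            System.out.println(name);
        }
    }
}
```

Running it on the same corpus directory on two machines should print the names in the same order, whereas the raw File.list() result may not match.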
I hope this is clear; if you have any questions, just let me know.
Massimo
Original comment by massimo....@gmail.com
on 15 Oct 2013 at 9:44
I've marked this as low priority, I'm afraid. While it's possible to create
different indexes on different OSes or by enforcing different orderings, it's
not clear that this is a blocking issue. As far as I know, we don't have user
groups who are forced to build a single index on a heterogeneous OS
platform.
So yes, you can build different vectors - but I don't think anyone is forced
to. Am I mistaken on this?
Original comment by dwidd...@gmail.com
on 28 Oct 2013 at 5:23
Original issue reported on code.google.com by
massimo....@gmail.com
on 4 Oct 2013 at 9:49