Open asfimport opened 11 years ago
Michael McCandless (@mikemccand) (migrated from JIRA)
+1, this is a good amoeba step!
I think this would be a useful abstraction.
E.g., maybe we could write directly to disk, or improve the RAM buffering to use growing/appending/paged buffers instead of one massive byte[] (which causes huge RAM spikes when we call ArrayUtil.grow). Actually, once we fix RAMFile, it could just use that.
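To illustrate the paged-buffer idea mentioned above, here is a minimal sketch. The class name PagedBytes and all of its methods are hypothetical and are not Lucene's actual API. It only shows how appending to fixed-size pages avoids reallocating and copying one ever-larger byte[].

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical paged byte store (illustration only, not a Lucene class). */
final class PagedBytes {
  private final int pageSize;
  private final List<byte[]> pages = new ArrayList<>();
  private int upto; // next write position within the current page
  private long size; // total number of bytes written

  PagedBytes(int pageSize) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be > 0");
    }
    this.pageSize = pageSize;
    this.upto = pageSize; // forces allocation of the first page on first write
  }

  /** Appends one byte; allocates a new page only when the current one is full. */
  void writeByte(byte b) {
    if (upto == pageSize) {
      pages.add(new byte[pageSize]); // existing pages are never copied
      upto = 0;
    }
    pages.get(pages.size() - 1)[upto++] = b;
    size++;
  }

  /** Reads the byte at an absolute position (long, so not limited to 2GB). */
  byte readByte(long pos) {
    if (pos < 0 || pos >= size) {
      throw new IndexOutOfBoundsException("pos=" + pos);
    }
    return pages.get((int) (pos / pageSize))[(int) (pos % pageSize)];
  }

  long size() {
    return size;
  }

  int pageCount() {
    return pages.size();
  }
}
```

With this layout, peak extra memory during growth is one page rather than a second copy of the whole buffer.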
Robert Muir (@rmuir) (migrated from JIRA)
I actually noticed the spike stuff in finish() too, because that's where we currently take the whole grow()'ed byte[] used during construction and shrink it to the size we actually need. We are doing this anyway, so we could just use something else for intermediate buffering instead.
One confusing thing is that FST is like an immutable concept from the outside, but from the code on the inside it's mutable. I really wish the buffering and stuff was instead encapsulated in Builder or somewhere else so that FST was simpler and immutable.
Michael McCandless (@mikemccand) (migrated from JIRA)
"I really wish the buffering and stuff was instead encapsulated in Builder or somewhere else so that FST was simpler and immutable."
+1
We now use the same class for writing as for reading, which is very confusing.
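The separation discussed in the comments above could look roughly like the following sketch. FrozenBytes and FrozenBytesBuilder are hypothetical names invented for illustration; this is not how FST or Builder are actually structured. The point is only that the mutable buffer lives in the builder and the result exposes no write methods.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical read-only result: holds exactly the bytes it needs. */
final class FrozenBytes {
  private final byte[] bytes;

  FrozenBytes(byte[] bytes) {
    this.bytes = bytes;
  }

  byte get(int pos) {
    return bytes[pos];
  }

  int length() {
    return bytes.length;
  }
}

/** Hypothetical builder that owns all mutable, intermediate buffering. */
final class FrozenBytesBuilder {
  // Stand-in buffer; it still grows a single array internally, so it does not
  // fix the RAM spike. It only demonstrates where the mutable state lives.
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private boolean finished;

  FrozenBytesBuilder add(int b) {
    if (finished) {
      throw new IllegalStateException("already finished");
    }
    buffer.write(b); // writes the low-order 8 bits of b
    return this;
  }

  /** Copies the buffered bytes into a right-sized array and freezes the result. */
  FrozenBytes finish() {
    if (finished) {
      throw new IllegalStateException("already finished");
    }
    finished = true;
    return new FrozenBytes(buffer.toByteArray());
  }
}
```

Because the reader type has no write methods, the "same class for writing as for reading" confusion cannot arise in this arrangement.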
Commit Tag Bot (migrated from JIRA)
[trunk commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1420014
LUCENE-4593: first step towards FST storage abstraction
Commit Tag Bot (migrated from JIRA)
[branch_4x commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1420017
LUCENE-4593: first step towards FST storage abstraction
Commit Tag Bot (migrated from JIRA)
[branch_4x commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1430334
LUCENE-4593: clean up how FST saves/loads the empty string output
Commit Tag Bot (migrated from JIRA)
[trunk commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1430333
LUCENE-4593: clean up how FST saves/loads the empty string output
Commit Tag Bot (migrated from JIRA)
[branch_4x commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1430342
LUCENE-4593: remove bogus true ||
Commit Tag Bot (migrated from JIRA)
[trunk commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1430341
LUCENE-4593: remove bogus true ||
Michael McCandless (@mikemccand) (migrated from JIRA)
Minor improvements, but an API change for the uber-Builder-ctor (the API is experimental): I changed allowArrayArcs from setter to ctor param (it doesn't make sense to change this while you are building).
Also added comment for lastFrozenNode ...
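As a rough illustration of the setter-to-ctor change described above: SketchBuilder below is a hypothetical stand-in and does not reproduce the real Builder constructor signature. Making the flag a final field set once in the constructor means it cannot change while building.

```java
/** Hypothetical stand-in for a builder (not the real Lucene Builder signature). */
final class SketchBuilder {
  // Fixed at construction time; no setter exists, so it cannot change mid-build.
  private final boolean allowArrayArcs;

  SketchBuilder(boolean allowArrayArcs) {
    this.allowArrayArcs = allowArrayArcs;
  }

  boolean allowArrayArcs() {
    return allowArrayArcs;
  }
}
```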
Robert Muir (@rmuir) (migrated from JIRA)
Nuke this setter!
Commit Tag Bot (migrated from JIRA)
[trunk commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1430477
LUCENE-4593: move allowArrayArcs to ctor
Commit Tag Bot (migrated from JIRA)
[branch_4x commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1430480
LUCENE-4593: move allowArrayArcs to ctor
I was looking at James's patch for #4371, and I thought that, you know, FST almost abstracts its underlying "i/o" (storage) via reader/writer abstractions.
It would be good to try to work on this more, e.g. we can imagine a little abstraction, like the one Lucene has for its store (Directory).
This way maybe we could clean up the packed vs. non-packed handling, allow for > 2GB FSTs without slowing down small ones, and so on.
I have a patch that is like an amoeba-step towards this.
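A storage abstraction along these lines might be sketched as follows. ByteStorage, ArrayStorage, and StorageDemo are hypothetical names made up for this example and do not come from the patch. Callers address bytes through the interface with a long position, so a small single-array implementation and a larger paged one could be swapped without changing the calling code.

```java
import java.util.Arrays;

/** Hypothetical storage abstraction (illustration only, not from the patch). */
interface ByteStorage {
  void writeByte(byte b);

  byte readByte(long pos);

  long size();
}

/** Simple single-array implementation, suitable for small structures. */
final class ArrayStorage implements ByteStorage {
  private byte[] bytes = new byte[16];
  private int size;

  @Override
  public void writeByte(byte b) {
    if (size == bytes.length) {
      bytes = Arrays.copyOf(bytes, bytes.length * 2); // doubles the capacity
    }
    bytes[size++] = b;
  }

  @Override
  public byte readByte(long pos) {
    if (pos < 0 || pos >= size) {
      throw new IndexOutOfBoundsException("pos=" + pos);
    }
    return bytes[(int) pos];
  }

  @Override
  public long size() {
    return size;
  }
}

/** Client code depends only on the interface, not on how bytes are stored. */
final class StorageDemo {
  static long sum(ByteStorage storage) {
    long total = 0;
    for (long i = 0; i < storage.size(); i++) {
      total += storage.readByte(i);
    }
    return total;
  }
}
```

A second implementation backed by multiple pages could implement the same interface to go past 2GB, while small structures keep the cheaper single-array path.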
Migrated from LUCENE-4593 by Robert Muir (@rmuir), updated Jan 08 2013 Attachments: LUCENE-4593.patch (versions: 2) Sub-tasks:
5682