Names cache assumes hash collision is not possible

ColinGilbert / whefs

Automatically exported from code.google.com/p/whefs

Other

2 stars 0 forks source link

Names cache assumes hash collision is not possible #2

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago

The names cache currently does not recognize duplicates. It will throw an
error if an attempt is made to add a duplicate hash key. The
whefs_hashid_list code partially handles duplicates, but dupes are not
properly handled for purposes of searching by name. If there are dupes,
search by name will return the false entry 1/Nth of the time, where N is
the number of collisions. The code is configured to spit out a warning if a
collision is ever found. If that warning ever shows up then i'll put some
effort into solving the problem.

When we have a duplicate name hash, we have to load the full names of all
colliding entries and do a strcmp on them.

i got permission from Christopher Clark today to re-license a derivative of
his hashtable code for use in whefs, so i may (depending on what tests show
the real memory costs are) switch to a real hashtable for the cache.

Original issue reported on code.google.com by sgbeal@googlemail.com on 17 Jun 2009 at 12:35

GoogleCodeExporter commented 9 years ago

Original comment by sgbeal@googlemail.com on 18 Jun 2009 at 11:52

Added labels: Subsystem-whefs, Usability

GoogleCodeExporter commented 9 years ago

On a related note: here's some code for an AVL tree:

http://piumarta.com/software/tree/

which i may be able to adapt to a whio_dev-stored AVL tree. If i can manage 
that, the
hash cache can be reimplemented on top of that. It would save us memory (the 
cost
would now be flat, instead of dependent on whefs_fs::options::inode_count), but 
cost
use some i/o.

Original comment by sgbeal@googlemail.com on 29 Jun 2009 at 3:43

GoogleCodeExporter commented 9 years ago

After hacking on the AVL tree for some time i've come to the conclusion that 
porting
it to an on-disk implementation is going to be much more work than i had hoped, 
and
cost a lot more i/o than i had hoped. So that's on hold for now.

Original comment by sgbeal@googlemail.com on 2 Jul 2009 at 7:50

GoogleCodeExporter commented 9 years ago

Fixing this will be deferred until the whio_epfs/whio_udb port is (eventually) 
done, as that port will inherently fix this problem (whio_udb gracefully 
handles collisions in its hashtable).

Original comment by sgbeal@googlemail.com on 11 May 2010 at 9:16

Changed state: DeferredPendingChanges

GoogleCodeExporter commented 9 years ago

As of today i think i got the last of the naming-related bugs fixed in the 
whio_epfs hashtable-based namer:

http://fossil.wanderinghorse.net/repos/whio/index.cgi/wiki/whio_epfs

Original comment by sgbeal@googlemail.com on 18 Apr 2011 at 6:23