The GUIDs generated may not be backwards compatible with those generated by
MetaGETA <= 1.2. This is because those GUIDs (which are meant to be
reproducible based on filepath) were based on paths that may not be
constant - such as non-UNC and mixed case filepaths in Windows.
This will be an issue if we want to update existing crawl results, though
I have implemented a little hack in runcrawler to work around this.
For example, assume U:\ is mapped to \\server\share and V:\ is mapped to
\\server\share\subfolder. U:\subfolder\test.tif is the same file as
V:\test.tif, but each path would generate a different GUID. Also, GUIDs based
on V:\test.tif are different to those based on v:\test.tif (and other case
variations).
The GUIDs from now on should be constant on Windows as they are "normcased"
and converted to UNC (I don't know enough yet about mount points and links
on *nix to implement something similar).
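
A rough sketch of the idea, not the actual MetaGETA code. It assumes the drive mappings are already known (the DRIVE_MAP dict is hypothetical and just mirrors the U:\ and V:\ example; a real version would have to ask Windows for the mappings) and that the GUID is a uuid5 of the normalised path, which is only my stand-in for however the GUID is really derived. The names normalise_path and path_guid are made up for illustration.

```python
import ntpath
import uuid

# Hypothetical drive mappings mirroring the example in this issue.
# A real implementation would query Windows for the mapped drives.
DRIVE_MAP = {
    "u:": r"\\server\share",
    "v:": r"\\server\share\subfolder",
}

def normalise_path(path, drive_map=DRIVE_MAP):
    """Return a lower-cased UNC form of a Windows path."""
    drive, rest = ntpath.splitdrive(path)
    unc_root = drive_map.get(drive.lower())
    if unc_root is not None:
        # Swap the mapped drive letter for its UNC root.
        path = unc_root + rest
    # ntpath.normcase lower-cases the path and converts "/" to "\".
    return ntpath.normcase(ntpath.normpath(path))

def path_guid(path):
    """Derive a reproducible GUID from the normalised path.

    uuid5 is an assumption made for this sketch only.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, normalise_path(path)))
```

With this, U:\subfolder\test.tif, V:\test.tif and v:\TEST.tif all normalise to the same UNC path and so get the same GUID.
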
There may still be an issue with datasets stored on moveable media, etc. I
think we need to look into getting disk IDs.
Original issue reported on code.google.com by pinner.luke@gmail.com on 26 Mar 2010 at 3:54