syndicate-storage / syndicate

Internet-scale software-defined storage system
Apache License 2.0

AG entry duplicated entry publishment issue #74

Closed iychoi closed 10 years ago

iychoi commented 10 years ago

AG drivers need the existing directory information before they populate file entries.

iychoi commented 10 years ago

The AG needs to check whether entries already exist before it publishes them to the MS, to avoid duplicate entry updates.

jcnelson commented 10 years ago

Take a look at ms_client_get_listings() in libsyndicate/ms-client.cpp. It lets you download a batch of directories from the MS, so you can see if files exist and are as fresh as your cached copies.

Basically, you build up a vector of ms_path_ent structures (using ms_client_make_path_ent()) that represents an absolute path in the filesystem (note: the vector is typedef'ed to path_t in ms-client.h). ms_client_get_listings() checks each directory's status on the MS, fetches the listing for each one that exists, and uses the listings to populate an ms_response_t structure. Take a look at UG/consistency.cpp for how it's used in practice (particularly fs_entry_build_ms_path(), fs_entry_ms_path_append(), fs_entry_revalidate_path(), fs_entry_reload_local_path_entries(), and fs_entry_reload_remote_path_entries()).
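As a rough illustration of the "vector of path entries" idea, here is a minimal sketch that breaks an absolute path into one entry per component, root first. It does not use the real libsyndicate types: PathEntSketch and build_path_sketch() are hypothetical names, the struct layout is an assumption, and the real ms_path_ent (built with ms_client_make_path_ent()) carries different fields. It also assumes the input path is already normalized.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a path entry; NOT the real ms_path_ent layout.
struct PathEntSketch {
    std::string path;     // absolute path of this component
    long        version;  // locally cached version, -1 if unknown (assumed field)
};

// Break a normalized absolute path into one entry per component, root first.
// For example, "/a/b/c" yields "/", "/a", "/a/b", "/a/b/c".
std::vector<PathEntSketch> build_path_sketch(const std::string& abs_path) {
    std::vector<PathEntSketch> out;
    out.push_back({"/", -1});

    std::string::size_type pos = 1;  // skip the leading '/'
    while (pos < abs_path.size()) {
        std::string::size_type next = abs_path.find('/', pos);
        if (next == std::string::npos) {
            next = abs_path.size();
        }
        if (next > pos) {
            // The prefix up to (not including) the separator names this component.
            out.push_back({abs_path.substr(0, next), -1});
        }
        pos = next + 1;
    }
    return out;
}
```

In the real code, a vector like this would be the path_t handed to ms_client_get_listings(); see ms-client.h for the actual structure and arguments.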

You should be able to do something similar to query the MS for the current set of entries it knows about, as well as how fresh the MS thinks they are. In fact, this is a large part of what the UG does--it maintains a cached metadata tree in RAM that it keeps consistent with the MS via successive calls to ms_client_get_listings() (via fs_entry_revalidate_path()).
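The check-before-publish step could look roughly like the sketch below. It is not the AG's actual code: LocalEntry, PublishAction, and decide_action() are hypothetical names, and the std::map stands in for whatever the AG extracts from the ms_response_t returned by ms_client_get_listings(). Using a modification time as the freshness measure is also an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical view of an entry the AG wants to publish.
struct LocalEntry {
    std::string path;   // absolute path of the entry
    long        mtime;  // local modification time (assumed freshness measure)
};

enum class PublishAction { Create, Update, Skip };

// Decide what to do with one entry, given the MS's current listing.
// ms_listing maps path -> mtime as last reported by the MS (assumed shape).
PublishAction decide_action(const std::map<std::string, long>& ms_listing,
                            const LocalEntry& entry) {
    auto it = ms_listing.find(entry.path);
    if (it == ms_listing.end()) {
        return PublishAction::Create;  // the MS has never seen this entry
    }
    if (it->second < entry.mtime) {
        return PublishAction::Update;  // the MS copy is older than ours
    }
    return PublishAction::Skip;        // the MS is already up to date
}
```

With a decision like this in place, the AG would publish only the Create and Update cases, so entries the MS already has are not re-sent.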

jcnelson commented 10 years ago

I'm working on this right now, as part of refactoring the AG.

jcnelson commented 10 years ago

Fixed in da8c79d5252f656c001ca416723c29271b5e680e