Open Ichoran opened 7 years ago
We could implement a mechanism for registering global IDs, similar to tinyurl, at the time experiments are added to a database under our control (or allow the same experiment to be assigned different IDs, since we cannot control the registration process).
I do recall the WCON format was designed to allow for arbitrary snippets of movement data. Do we want to have IDs for those too?
For uniquely identifying experiments, let's add a string "experiment_UUID" field to the "metadata" object.
https://en.wikipedia.org/wiki/Universally_unique_identifier
https://stackoverflow.com/questions/4230357/how-do-i-represent-a-guid-in-a-json-object
To the parser, we can add a function to generate a UUID if none is present. Then when the parser writes the WCON object to a file, the UUID will be saved in the metadata.
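A minimal sketch of that idea in Python, assuming the WCON object is held as a plain dict and using the "experiment_UUID" field name proposed above. The helpers ensure_experiment_uuid and write_wcon are hypothetical names for illustration, not functions from any existing WCON parser:

```python
import json
import uuid


def ensure_experiment_uuid(wcon: dict) -> dict:
    """Add a random (version 4) UUID to the metadata if none is present.

    Hypothetical helper: the field name "experiment_UUID" follows the
    proposal in this thread and is not part of a published spec.
    """
    metadata = wcon.setdefault("metadata", {})
    if not metadata.get("experiment_UUID"):
        metadata["experiment_UUID"] = str(uuid.uuid4())
    return wcon


def write_wcon(wcon: dict, path: str) -> None:
    """Write the WCON dict as JSON, filling in the UUID first."""
    with open(path, "w") as f:
        json.dump(ensure_experiment_uuid(wcon), f, indent=2)
```

An ID that is already present is left untouched, so reading a file and writing it back would not change its identity.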
UUIDs are complicated. Why not provide an ID field and suggest that if you put a UUID in there (generated with standard methods), you'll be guaranteed it's unique? If you put "apple" in every time, well, results may not be so good. Since it will be an optional field anyway, we can't rely upon it for anything important. And the "snippets" point is a good one too.
Having a way to write a UUID into the metadata on file creation would be great, though!
In the financial world, there is often the need for objects to have two ids:
Here we could have "id" for the latter and "worm_universal_id" for the former, and the specification could recommend that organizations either populate "worm_universal_id" with a UUID or leave it blank and the writer will take care of creating one.
Now since we can have multiple worms specified in the same file, and "id" is used in each track in the data to distinguish them, but is not specified in the metadata object, we'd need a lookup object in the metadata. So in the metadata we would have a "worm_universal_ids" object to specify them:
    "worm_universal_ids": {
        "1": "616386ea-f500-45a5-a2ef-1fe9b08f7040",
        "2": "132b192e-4270-4b8d-99ad-8793a267f3bf"
    }
Or perhaps something more elegant?
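For illustration, here is roughly how a reader might resolve a track's "id" through such a lookup object. This is only a sketch: it assumes the file has been loaded into a dict, and universal_id_for is a hypothetical helper name.

```python
# Example WCON-like dict using the lookup object proposed above.
example = {
    "metadata": {
        "worm_universal_ids": {
            "1": "616386ea-f500-45a5-a2ef-1fe9b08f7040",
            "2": "132b192e-4270-4b8d-99ad-8793a267f3bf",
        }
    },
    "data": [{"id": "1"}, {"id": "2"}, {"id": "3"}],
}


def universal_id_for(wcon: dict, track_id) -> "str | None":
    """Return the universal ID registered for a track's "id", or None.

    Hypothetical helper; a missing lookup object or a missing entry
    both yield None rather than raising an error.
    """
    lookup = wcon.get("metadata", {}).get("worm_universal_ids", {})
    return lookup.get(str(track_id))


# Map every track in the file to its universal ID (None if unregistered).
resolved = {t["id"]: universal_id_for(example, t["id"]) for t in example["data"]}
```

Tracks with no entry in the lookup simply resolve to None, which matches the field being optional.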
If one wants a UUID for a worm's ID, why not just use the ID field (since we're making it string-only anyway)? UUIDs are easy enough to recognize, so it's not like you need a separate field to be able to tell whether there's a UUID or not.
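As a rough illustration of how a reader could tell the two apart, Python's standard uuid module can be used to test whether an ID string is a UUID in canonical hyphenated form. The function name looks_like_uuid is made up for this sketch:

```python
import uuid


def looks_like_uuid(value: str) -> bool:
    """Return True if value is a UUID in canonical hyphenated form.

    uuid.UUID() also accepts variants such as braces or no hyphens,
    so the round-trip comparison restricts the check to the usual
    8-4-4-4-12 layout (case-insensitive).
    """
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False
```

With this check, "616386ea-f500-45a5-a2ef-1fe9b08f7040" is recognized as a UUID while "apple" is not, so a single string ID field can carry either kind of value.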
OK. And I'm realizing that this issue is about experiment ids, not worm ids. So perhaps what we really need is: a metadata field "experiment_id", a string; a recommendation that "id" and "experiment_id" values use UUIDs; and a way for the parser to generate an "experiment_id" if one is not present.
I'm all for giving people the tools to easily stick UUIDs in, but I think even a recommendation is a little strong. We want people to put the thing that will most help them identify the experiment in the experiment ID field. In case someone doesn't know about UUIDs, it's nice to alert them. Beyond that, I'd leave it to them. Of course, if you want to maintain a database, you may want to have additional restrictions (e.g. all experiments are identified by a UUID) for what goes into a submitted WCON file.
We have all sorts of detail about the experiment, but nothing that helps to identify the experiment. Of course we can't guarantee globally unique experiment IDs, but there is no other field that is obvious for this use. (Possibly a line in "protocol"?)
We should introduce a standard way to specify this in metadata, possibly "id" (since we already have per-worm IDs). The id should probably be a string.