G-Node / nix

Neuroscience information exchange format
https://readthedocs.org/projects/nixio/
Other
66 stars 36 forks source link

Ordering of sources in EntitiesWithSources #686

Closed jgrewe closed 7 years ago

jgrewe commented 7 years ago

solves issue #683

achilleas-k commented 7 years ago

Wrote a test for it (sources only, HDF5 only) here, if you feel like pulling it in: https://github.com/achilleas-k/nix/commit/abd8ba3d46c041892fe11034186e425de785e91c

jgrewe commented 7 years ago

@achilleas-k Thanks for the test!

gicmo commented 7 years ago

So, I am afraid I was right: I made a quick patch to randomise the list of names in the sources vector (see and maybe pull from here: https://github.com/gicmo/nix/tree/pr/686) and this results in:

Test name: TestEntityWithSourcesHDF5::testSourceOrder
assertion failed
- Expression: sources[idx].name() == set_group.getSource(idx).name()

So if I read issue #683 correctly then this patch does not actually fix that issue (hence my confusion looking at the code yesterday). To properly fix it we have to retain the order of the entities to be added, i.e. the add vector needs to be sorted to match the incoming sources vector. I guess it is obvious but I say it anyway: could be done with a combination of std::sort with a lambda using std::find on the original sources vector. Should be quickly implemented but maybe not super efficient (although the obvious LUT approach might not be faster for the few items we have in the vectors normally).

achilleas-k commented 7 years ago

I think I need to rewind a bit because I lost track of the general idea here.

Correct me if I'm wrong: The vector setter (I'm calling it that now because that's what the corresponding test calls it) replaces all referenced sources. It does this by adding Sources that are in new_arrays but not in old_arrays and removing sources that are in old_arrays but not in new_arrays.

If this is the case, preserving order becomes a bit tricky, right? What's the logical result of the following (simplified) code snippet?

group.sources({A, B, C});
assert(A == group.sources(0));  // at least it should

group.sources({A, D, B, E, C});
assert(D == group.sources(1));  // Is it?
assert(B == group.sources(1));  // or maybe this?

Which of the two last asserts fails? (or I guess, which one should fail?)

I'm not entirely sure what the behaviour of this is right now, but I would expect D to be in the second position there.

UPDATE: After talking with Jan about this for a minute, we both agree that the most reasonable way for this to work is to simply clear the old vector and set the new one in its place. The order of items after that should just be the order of the new vector.

jgrewe commented 7 years ago

@achilleas-k you are right, in the original version we add only those that are not already there and remove those that have been there but are not in the argument vector.

So far, it has been sorted by the ID, thus, we cannot tell in your example. The setter in this pr would sort according to the name of the entity and then

assert(B == group.sources(1));

would be working.

The simplified version above would now simply take the order given by the input vector and

assert(D == group.sources(1));

would be correct. I tend to this solution. It is rather straight forward and may even be faster than all the io traffic for checking who is there...

jgrewe commented 7 years ago

will address this issue later today