Closed blumonkey closed 9 years ago
Were the statements inserted with the same IDs, or did LL assign the IDs? If LL assigned the IDs, what OS are you using?
Were the statements inserted with the same IDs
I did not understand this. I did not send any statement ID, only the Actor, Verb, and Object. I guess that means LL assigned them. I am running Ubuntu 14.04_x64
This is likely because of a very serious bug in Learning Locker: it doesn't generate UUIDs properly. If two statements arrive from the same remote address and have their UUIDs generated in the same microsecond (which, since generation is called in a loop for statements received together, will happen quite often), they get the same random seed, which means they get the same identifier. Learning Locker should either seed the generator once at server start and then generate all the random bits of each randomized UUID without reseeding (as is normally done for server applications) or, even better, use a proper UUID library and either take the random UUIDs it produces or, best, use v1 UUIDs, which are based on timestamp + MAC address + a counter that increments when the timestamp is the same.
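To make the failure mode concrete, here is a minimal Python sketch (not Learning Locker's actual PHP code). The function `seeded_uuid` and its parameters are hypothetical stand-ins for the pattern described above: reseeding a PRNG on every call with a seed built only from the remote address and the current microsecond. It is contrasted with `uuid.uuid4()` from Python's standard library, which draws from the OS random source and is never reseeded by the caller.

```python
import random
import uuid


def seeded_uuid(remote_addr: str, microseconds: int) -> str:
    """Hypothetical stand-in for the flawed pattern: reseed on every call.

    The seed is fully determined by (remote_addr, microseconds), so two
    calls within the same microsecond produce the same "random" bits.
    """
    rng = random.Random(f"{remote_addr}:{microseconds}")
    # version=4 sets the reserved version/variant bits; the rest stay "random".
    return str(uuid.UUID(int=rng.getrandbits(128), version=4))


def demo():
    same_tick = 1_400_000_000_000_000  # an arbitrary microsecond timestamp
    a = seeded_uuid("10.0.0.1", same_tick)
    b = seeded_uuid("10.0.0.1", same_tick)  # same address, same microsecond
    c = str(uuid.uuid4())  # library-generated, no per-call reseeding
    d = str(uuid.uuid4())
    return a, b, c, d


if __name__ == "__main__":
    a, b, c, d = demo()
    print("reseeded pair identical:", a == b)
    print("uuid4 pair identical:   ", c == d)
```

The reseeded pair is identical by construction, which is the same-ID symptom reported in this issue; the library-generated pair is not.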
I see there's an attempt to mitigate the loop problem with generated_ids, which I think works, though at an increasingly expensive cost for any large set of statements (N^2, which should never be acceptable). The collision will still occur if two statements are sent separately but very close to each other. Also, the UUID generation isn't fully correct: there are reserved bits, but all bits are randomized in this implementation.
No, wait, I misread. It has the bits right; the randomization ranges just blended together with them while I was reading.
So just the conflict problem and the N^2 problem.
Hmm, actually not sure of the complexity. Base complexity is actually N log N, I think, since it is a growing array you're searching (still no reason for having a factor at all, and I'm pretty sure the constant is high), but since it has to retry every time a duplicate comes up... let me see... a modern processor cycles in less than a nanosecond, so more than a thousand times a microsecond, and some platforms don't even provide true microsecond precision in their clocks... I'm not sure how long it takes the random calls to get through the CPU, but I'm pretty sure it's less than 1000 cycles, so you have to go around that loop some number of times. Yeah, basically sums up to "very bad".
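As a rough illustration of the cost being discussed, the Python sketch below counts comparisons when each new ID is checked against a growing list by linear scan. That the generated_ids check is a linear scan is an assumption; the actual Learning Locker code is not reproduced here. Both function names are hypothetical. Under that assumption the total work grows quadratically with the batch size, whereas a hash set gives the same duplicate detection with one lookup per ID.

```python
def count_list_comparisons(ids):
    """Check each ID against a growing list by linear scan.

    Returns the total number of element comparisons performed.
    """
    seen = []
    comparisons = 0
    for new_id in ids:
        for old_id in seen:
            comparisons += 1
            if old_id == new_id:
                break  # duplicate found, stop scanning
        else:
            seen.append(new_id)  # no duplicate: remember this ID
    return comparisons


def dedupe_with_set(ids):
    """Same duplicate detection with a hash set: one lookup per ID."""
    seen = set()
    unique = []
    for new_id in ids:
        if new_id not in seen:
            seen.add(new_id)
            unique.append(new_id)
    return unique


if __name__ == "__main__":
    for n in (100, 1_000):
        # With n distinct IDs the scan does n * (n - 1) / 2 comparisons.
        print(n, count_list_comparisons(range(n)))
```

Neither variant addresses the underlying problem: if the generator is not reseeded from the clock, duplicates within a batch are not a realistic concern and no check is needed at all.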
@fugu13 So what is the attempted solution?
Huh, actually, it looks like it might not even reach that function? There's also xAPIValidation, which has the same improper use of a PRNG but is run on each statement independently (if I'm reading it right) before the storer ever gets hold of them. That will be even more likely to have the problem.
@blumonkey if you delete the first two lines of https://github.com/LearningLocker/learninglocker/blob/be24d6339b19c9bded4e74825214b19432a0903c/app/locker/statements/xAPIValidation.php#L825 and https://github.com/LearningLocker/learninglocker/blob/05c3feffa589382f14c743a76227f2f987a55497/app/locker/repository/Statement/EloquentStorer.php#L110 , the ones that assign to $remote_addr and call mt_srand, that should suffice as a hotfix.
We're working on a fix for this now. As far as I'm aware the method for generating UUIDs in xAPIValidation never gets used. We need to delete some parts of that class.
That loop can also be improved, since we only actually need the last generated ID, not all of them.
It just shouldn't be a loop. UUIDv1 will guarantee you don't collide. Heck, if you keep using your UUIDv4 random approach, the probability of a collision among a small number of randomly generated UUIDs (up to, say, millions, though these batches will more commonly be one or two), even using a PRNG, is less worth worrying about than our sun eventually turning into a red giant and all life on Earth ending. Unless you re-seed it every time with a seed fully determined by the time, in which case collisions are virtually guaranteed, as computer clocks are so imprecise (a computer can do many operations within the smallest window for which it can provide a wall-clock value).
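For a sense of scale, the standard birthday approximation can be evaluated directly. This is a sketch under the assumption that the 122 non-reserved bits of a v4 UUID are uniformly random and independent, which holds only when the generator is not reseeded from the clock. The function name `collision_probability` is illustrative.

```python
import math


def collision_probability(n: int, random_bits: int = 122) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    random_bits defaults to 122, the number of non-reserved bits in a
    version 4 UUID. expm1 keeps precision when p is extremely small.
    """
    return -math.expm1(-n * (n - 1) / 2 ** (random_bits + 1))


if __name__ == "__main__":
    for n in (2, 1_000, 1_000_000):
        print(f"{n:>9} UUIDs -> collision probability ~ {collision_probability(n):.3e}")
```

Even for a million UUIDs the result is many orders of magnitude below anything worth checking for in application code, which is the point being made above.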
Hi @blumonkey, this should be fixed by #688 and will be merged into the develop branch soon.
I have two AJAX calls that invoke TinCanPHP code to log a statement in the LRS, more or less back-to-back. If the calls are made asynchronously, both statements get the same ID. That's not the case if the calls are made synchronously. Can't LL handle concurrent requests? Or did I do something wrong?
The calls are like: