Statements allotted same IDs

LearningLocker / learninglocker

Learning Locker - The Open Source Learning Record Store. Started in 2014.

https://learningpool.com/solutions/learning-record-store-learning-locker/learning-locker-community-overview/

GNU General Public License v3.0

553 stars 276 forks source link

Statements allotted same IDs #684

Closed blumonkey closed 9 years ago

blumonkey commented 9 years ago

I have two AJAX calls that call to a TinCanPHP code to log a statement in the LRS, more or less, back-to-back, If the calls are being done async, both the statements have the same id. Its not the case if the calls are done sync. Can't LL handle concurrent requests? Or did I do anything wrong?

The calls are like:

 ajaxcall1();    
 ajaxcall2();    //both call the same page to log different statements

ryasmi commented 9 years ago

Were the statement's inserted with the same IDs or did LL assign the IDs? If LL assigned the IDs what OS are you using?

blumonkey commented 9 years ago

Were the statement's inserted with the same IDs

I did not understand this. I did not send any Statement ID. Only the Actor,Verb, and Object. I guess that means LL assigned them. I am running Ubuntu 14.04_x64.

fugu13 commented 9 years ago

This is likely because of a very serious bug in Learning Locker: it doesn't generate UUIDs properly. If two statements arrive from the same remote address and have their UUID's generated at the same microsecond (which, since it gets called in a loop for statements received together, will happen quite often), they get the same random seed, which means they get the same identifier. Learning Locker should either generate a random seed at server start then generate all the random bits in a randomized UUID from then on without reseeding (as is normally done for server applications), or even better, use a proper UUID library and either get a random UUID as given or, best, use v1 UUIDs, which are based on timestamp + mac address + incrementing number when timestamp is the same.

fugu13 commented 9 years ago

I see there's an attempt to mitigate the loop problem with generated_ids, which I think works, though at an increasingly expensive cost for any large set of statements (N^2, which should never be acceptable), but it will still occur if two statements are sent separately very close to each other. Also, the UUID generation isn't fully correct, there are reserved bits, but all bits are randomized in this implementation.

fugu13 commented 9 years ago

No, wait, I misread, it has the bits right, the randomization ranges just blended together with them while reading.

fugu13 commented 9 years ago

So just the conflict problem and the N^2 problem.

Hmm, actually not sure of complexity. Base complexity is actually N log N since it is a growing array you're searching, I think (still no reason for having a factor at all, and I'm pretty sure the constant is high), but since it has to retry every time that happens... let me see... a modern processor cycles in less than a nanosecond, so a thousand times a microsecond, and some platforms don't even provide true microsecond precision in their clocks... I'm not sure how long it takes the random calls to get through the CPU, but I'm pretty sure less than 1000 cycles, so you have to do that loop some number of times. Yeah, basically sums up to "very bad".

blumonkey commented 9 years ago

@fugu13 So what is the attempted solution?

fugu13 commented 9 years ago

Huh, actually, it looks like it might not even reach that function? There's also xAPIValidation, that has the same improper use of a PRNG, but is run on each statement independently (if I'm reading right) before the storer ever gets hold of them. That will be even more likely to have the problem.

@blumonkey if you delete the first two lines of https://github.com/LearningLocker/learninglocker/blob/be24d6339b19c9bded4e74825214b19432a0903c/app/locker/statements/xAPIValidation.php#L825 and https://github.com/LearningLocker/learninglocker/blob/05c3feffa589382f14c743a76227f2f987a55497/app/locker/repository/Statement/EloquentStorer.php#L110 , the ones that assign to $remote_addr and call mt_srand, that should suffice as a hotfix.

ryasmi commented 9 years ago

We're working on a fix for this now. As far as I'm aware the method for generating UUIDs in xAPIValidation never gets used. We need to delete some parts of that class.

ryasmi commented 9 years ago

That loop can also be improved since we only actually need the last generated ID not all of them.

fugu13 commented 9 years ago

It just shouldn't be a loop. UUIDv1 will guarantee you don't collide. Heck, if you keep using your UUIDv4 random approach, the probability of a small number (up to, say, millions, though these batches will more commonly be one or two) of randomly generated UUIDs, even using a PRNG, is less worth worrying about than our sun eventually turning into a red giant and all live on earth ending. Unless you re-seed it every time with a seed fully determined by the time, in which case collisions are virtually guaranteed as computer clocks are so imprecise (many operations can be done by a computer in the smallest window it can provide a walllclock value for).

0xdeadbeefc0ffee commented 9 years ago

Hi @blumonkey, this should be fixed by #688 and will be merged into the develop branch soon.