Open avis1234 opened 9 years ago
Are the annotations within the classifications the same?
According to @parrish, no. Here are the annotations for three classifications with the same timestamp and user, but different subjects and annotations.
[{"lang"=>"en"}, {"sloan_singleband-0"=>"a-2"}, {"sloan_singleband-11"=>"a-1"}]
[{"lang"=>"en"}, {"sloan_singleband-0"=>"a-1"}, {"sloan_singleband-1"=>"a-1"}, {"sloan_singleband-2"=>"a-1"}, {"sloan_singleband-3"=>"a-1"}, {"sloan_singleband-4"=>"a-3"}, {"sloan_singleband-5"=>"a-1"}, {"sloan_singleband-11"=>"a-1"}]
[{"lang"=>"en"}, {"sloan_singleband-0"=>"a-2"}, {"sloan_singleband-11"=>"a-1"}]
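Unfortunately, this is unavoidable. When the API receives a classification, it timestamps it immediately. The timestamps you're seeing in the data are set when the classification is created on the server, not when the user made it, so classifications that arrive together can end up with the same timestamp.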
Some common scenarios that cause this:
A mobile user, or a user on a flaky network connection (very common)
Times of unusually high traffic (less common)
The only way to approach this is to have the client timestamp the classifications before they are sent. The caveat is that there is no guarantee about what the client's system clock is set to.
I suppose you could try to calculate a client local time offset by comparing it to a response from the server and adjusting for network latency, but that's pretty far from reliable.
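For what it's worth, here is a rough Python sketch of that offset estimate. It assumes the request and the response each take half of the round trip, which is exactly the assumption that breaks down on mobile or flaky connections. The function name and the sample numbers are made up for illustration and are not part of the API.

```python
def estimate_clock_offset(client_sent, server_time, client_received):
    """Estimate (server clock - client clock) in seconds.

    Assumes the network delay is symmetric: the server stamped its
    response halfway through the round trip as seen by the client.
    """
    round_trip = client_received - client_sent
    # Client clock reading at the moment the server is assumed to have stamped the response.
    client_midpoint = client_sent + round_trip / 2.0
    return server_time - client_midpoint


# Illustrative values only: the client clock runs 30 s behind the server
# and the round trip takes 0.4 s.
offset = estimate_clock_offset(
    client_sent=1000.0, server_time=1030.2, client_received=1000.4
)

# Map a client-side timestamp onto the server clock.
corrected = 1000.0 + offset
```

Even in the best case the estimate can be off by up to half the round-trip time, and a single slow request skews it badly, so treat the result as a hint rather than a real timestamp.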
In a nutshell, you could figure out the order in which requests were sent, but not the actual time each request was sent.
We are integrating with the galaxy_zoo stream. Some events arrive with the same user_id, the same created_at, and different subjects. Logging this issue per our discussion with your team.
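In case it helps with reproducing this, below is a minimal Python sketch of how a consumer might flag these events. It assumes each event is a dict with `user_id` and `created_at` keys (the field names mentioned above) plus a `subject_id` key, which is a hypothetical name since the actual stream payload is not shown here. The sample events are invented for illustration.

```python
from collections import defaultdict


def find_timestamp_collisions(events):
    """Return groups of events that share (user_id, created_at) but cover more than one subject."""
    groups = defaultdict(list)
    for event in events:
        groups[(event["user_id"], event["created_at"])].append(event)
    return {
        key: group
        for key, group in groups.items()
        # "subject_id" is an assumed field name, see the note above.
        if len({e["subject_id"] for e in group}) > 1
    }


# Invented sample data, not taken from the real stream.
events = [
    {"user_id": 1, "created_at": "2015-01-01T00:00:00Z", "subject_id": "s1"},
    {"user_id": 1, "created_at": "2015-01-01T00:00:00Z", "subject_id": "s2"},
    {"user_id": 2, "created_at": "2015-01-01T00:00:00Z", "subject_id": "s3"},
]

collisions = find_timestamp_collisions(events)
```

With the sample data above, only user 1 is reported, with two events in the group.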