yetanalytics / lrsql

A SQL-based Learning Record Store
https://www.sqllrs.com
Apache License 2.0

How to GET only the newest records? #380

Open scizmeli opened 8 months ago

scizmeli commented 8 months ago

Hello,

I am trying to retrieve statements incrementally.

In the first request, our data analysis service gets ALL the records. After that, I would like to retrieve only the newest records that arrived since the last retrieval. I construct the URL as follows:

statements?limit=50&from=279e08f4-9ced-4dd9-a101-18c261bdadc2

where the UUID belongs to the record with the second-newest timestamp (so I expect to GET only 1 record). But I again GET 50 records instead of just 1.

I then tried to use the since parameter:

https://.../xapi/statements?limit=50&from=2024-03-08T13:47:39+00:00

where the timestamp used is the one before the newest.

This gives me an HTTP 500 error.

What am I doing wrong? How do I achieve incremental retrievals with only the newest records? Thanks.

kelvinqian00 commented 8 months ago

Hello @scizmeli,

Thanks for raising these issues. For your first issue, you need to set ascending=true. That is because if ascending=false, which is the default, you start from the latest statement and work your way backwards in time, which is the exact opposite of what you want.

As for your second URL, you used from= where you should have used since=; from is always a statement ID while since is always a timestamp. However, the fact that you got a 500 Server Error instead of a 400 Bad Request is concerning; if you don't mind, could you please share the stack trace?
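A minimal sketch of how a client might build such a request, assuming Python and a hypothetical base URL (`https://lrs.example/xapi/statements`, not taken from this thread). The helper name `build_statements_query` is illustrative only. It sets ascending=true and checks locally that since looks like a timestamp and from looks like a UUID before any request is sent:

```python
from datetime import datetime
from urllib.parse import urlencode
from uuid import UUID


def build_statements_query(base_url, *, since=None, from_id=None, limit=50):
    """Return a statements URL that pages forward in time (oldest first)."""
    # ascending=true so results start at the cursor and move towards newer statements.
    params = {"limit": limit, "ascending": "true"}
    if since is not None:
        # `since` must be a timestamp; fromisoformat raises ValueError otherwise.
        datetime.fromisoformat(since)
        params["since"] = since
    if from_id is not None:
        # `from` must be a statement ID (a UUID); UUID() raises ValueError otherwise.
        params["from"] = str(UUID(from_id))
    # urlencode percent-encodes "+" and ":" in the timestamp's UTC offset.
    return f"{base_url}?{urlencode(params)}"


# Hypothetical base URL, for illustration only.
url = build_statements_query(
    "https://lrs.example/xapi/statements",
    since="2024-03-08T13:47:39+00:00",
)
```

Percent-encoding the timestamp is worth doing in any case: many servers decode a raw `+` in a query string as a space, which would corrupt the `+00:00` offset.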

scizmeli commented 8 months ago

Thank you @kelvinqian00 for your fast response. My confusion arose from the fact that I was expecting the returned records to be sorted by timestamp. But now I see they are in fact sorted by stored, and the since parameter applies to the stored field. Now I get it. My bad for not reading the documentation more carefully.
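Because results are ordered by stored, an incremental client can keep the stored value of the last statement it received and pass it as since on the next request. A rough sketch, assuming Python; `poll_once` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever HTTP call returns the statement list with ascending=true:

```python
from typing import Callable, Optional


def poll_once(fetch: Callable[[Optional[str]], list], checkpoint: Optional[str] = None):
    """Fetch statements stored after `checkpoint`; return (statements, new checkpoint)."""
    statements = fetch(checkpoint)
    if statements:
        # Ascending order by `stored` means the last statement is the newest one.
        checkpoint = statements[-1]["stored"]
    # With no new statements, the old checkpoint is kept unchanged.
    return statements, checkpoint
```

The checkpoint is taken from stored rather than timestamp, since stored is the field that since filters on.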

But now another source of confusion is here. There are sometimes big differences between values of timestamp and stored. When this difference is about 1 second, it perfectly makes sense. But in some situations, it is several hours... How do we explain this?

[Screenshot: timestamp and stored values]

I will open another ticket to follow up on the 500 error.

kelvinqian00 commented 8 months ago

@scizmeli timestamp refers to the time when the statement is created (e.g. by the Learning Record Provider), while stored refers to the time when the statement is inserted into the LRS and is auto-generated by the LRS. From the timestamp section of the xAPI spec:

The "timestamp property" in a Statement can differ from the "stored" property (the time at which the Statement is stored). Namely, there can be delays between the occurrence of the experience and the reception of the corresponding Statement by the LRS.
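To see how large that gap is for each statement, one could compute stored minus timestamp directly. A small sketch, assuming Python and that both fields are ISO 8601 strings with an explicit UTC offset that `datetime.fromisoformat` accepts (values ending in `Z` or carrying more than six fractional digits may need normalizing first on older Python versions). The function names, the five-minute threshold, and the sample statement are made up for illustration:

```python
from datetime import datetime, timedelta


def storage_lag(statement):
    """Return how long after its `timestamp` a statement was `stored`."""
    created = datetime.fromisoformat(statement["timestamp"])
    stored = datetime.fromisoformat(statement["stored"])
    return stored - created


def delayed_ids(statements, threshold=timedelta(minutes=5)):
    """Return the IDs of statements whose lag exceeds `threshold`."""
    return [s["id"] for s in statements if storage_lag(s) > threshold]


# Made-up statement, for illustration only.
example = {
    "id": "example-1",
    "timestamp": "2024-03-08T10:00:00+00:00",
    "stored": "2024-03-08T13:47:39+00:00",
}
lag = storage_lag(example)
```

Listing the statements with the largest lag can help show whether delays cluster around particular times or sources.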

kelvinqian00 commented 8 months ago

@scizmeli The 500 error that you got from the invalid from parameter has been addressed in v0.7.9. Now, using an invalid from parameter will result in a 400 Bad Request response instead.

kelvinqian00 commented 8 months ago

@scizmeli Upon further inspection of your screenshot of the timestamp and stored values, there do seem to be inconsistencies that I overlooked in my original response; apologies for that. Do you know how your LRP sends statements? Is there a buffer or a queue in place that may delay posting?