rr0hit / gae-livescore

A Google App Engine Application that delivers data from livescore.com to clients in JSON format
GNU General Public License v3.0

appengine datastore write operation quota exceeded #1

Open andrewtryder opened 12 years ago

andrewtryder commented 12 years ago

Hi,

Love the app, but with only 2 active "competitions" (MLS and Euro 2012) and 3 inactive ones (EPL/Spain/Italy), I'm exceeding the datastore write operation quota. Have you encountered this?

Here is the usage after about 18 hours: http://i.imgur.com/i68Fk.png

(I'm the only one using/accessing it)

Thanks.

rr0hit commented 12 years ago

Thanks for the interest and for bug #1. I know of this issue; it is caused by all matches being freshly written into the datastore at the specified interval, even when a match is not live at the time. I would suggest a more suitable cron.yaml[1] configuration that specifies the times of day at which to look for updates. By default, the update runs throughout the day.

I plan a major refactoring of the code soon (before the club season in Europe kicks off). I shall try to keep datastore write usage to a minimum. Other changes will include a better scraper and possibly the use of NDB instead of the older db datastore API.

[1] https://developers.google.com/appengine/docs/python/config/cron

andrewtryder commented 12 years ago

Hi,

I looked into how update.py works, and it looks like you kept it really simple: delete everything and then re-write it all, which you addressed above.

After learning a bit more about the quotas, I guess the best way would be:

1.) Use the cron properly, as you suggest. Maybe this could be "dictated" by match times? I'm mainly using it for European matches as well, so you could focus on updating once every minute or two between 1300 GMT and 2100 GMT.

2.) If the match isn't live, the update should skip it and keep what is stored. However, the datastore needs some field changes, like using a datetime instead of a string.
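A rough sketch of both points in plain Python, with dicts standing in for the datastore. The names `in_update_window` and `matches_to_write` are made up for illustration and are not part of update.py. The window is the 1300-2100 GMT range mentioned above. Point 2 is read here as "write only what is new or changed", which also skips matches that are not live:

```python
from datetime import datetime, time

# Hypothetical update window, taken from the 1300-2100 GMT suggestion.
WINDOW_START = time(13, 0)
WINDOW_END = time(21, 0)


def in_update_window(now_utc):
    """Return True if a naive UTC datetime falls inside the update window."""
    return WINDOW_START <= now_utc.time() <= WINDOW_END


def matches_to_write(scraped, stored):
    """Return only the scraped matches that are new or differ from storage.

    Both arguments map a match key to a dict of fields. A match that is not
    live scrapes to the same fields every time, so it drops out here and
    costs no write.
    """
    return {
        key: fields
        for key, fields in scraped.items()
        if stored.get(key) != fields
    }
```

The update handler could return early when `in_update_window(datetime.utcnow())` is false, and otherwise put only the entities returned by `matches_to_write`.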

How soon are you planning on doing it? I could see if I can help. I've not done a big appengine project but know enough python.

rr0hit commented 12 years ago

My planned approach for the datastore limit problem is:

a) Have 3 tables, one for live matches, one for completed matches and one for scheduled matches. The latter two would be updated at a slower pace. Live matches shall be updated more regularly.

b) Have a unique match identifier based on team names and date. Identify the match that has just started and move the match from scheduled table to live table. Identify the match that has just finished and move it to completed matches.

These two steps should minimize the writes.
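As an illustration only, here is a minimal sketch of (a) and (b), with three in-memory dicts standing in for the three tables. The names `match_key` and `MatchTables`, the key format, and the `status` field are assumptions made for the sketch, not the final design:

```python
from datetime import date


def match_key(home, away, match_date):
    """Build a stable identifier from the team names and the match date."""
    def slug(name):
        return "-".join(name.lower().split())
    return "{}_{}_{}".format(match_date.isoformat(), slug(home), slug(away))


class MatchTables(object):
    """In-memory stand-in for the scheduled, live and completed tables."""

    def __init__(self):
        self.scheduled = {}
        self.live = {}
        self.completed = {}
        self.changes = 0  # number of updates that actually touched storage

    def apply(self, key, fields):
        """Store one scraped record; fields["status"] names the target table.

        Returns False without touching storage when the record is unchanged.
        """
        tables = {
            "scheduled": self.scheduled,
            "live": self.live,
            "completed": self.completed,
        }
        target = tables[fields["status"]]
        if target.get(key) == fields:
            return False
        # Remove the match from whichever table held it, then store it anew.
        for table in tables.values():
            table.pop(key, None)
        target[key] = fields
        self.changes += 1
        return True
```

A match that kicks off moves from `scheduled` to `live` on the first update that reports it as live, and an unchanged record costs nothing. In the real datastore the move would be a delete plus a put, so it is not free, but it happens only twice per match.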

I have set a milestone for the first week of August. I might be able to get some work in during the second half of July. Meanwhile, I have opened issue #2, which is fairly easy and requires only Python work. Also, I did a slight re-branding from "Livescore-API" to "gae-livescore".

andrewtryder commented 12 years ago

a.) I like the idea, and it sounds like it will work a bit better.

b.) Why not use the "match id"? It could also be used to get details on a match, like who scored and yellow cards. I don't think the app should store this, but maybe it could have a pass-through method to get details upon call? For example: gae-livescore.appspot.com/matchdetails.json?matchid=214921

Unless it is easy enough to store without hitting the quota.
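To make the pass-through idea concrete, here is a small sketch that stores nothing in the datastore and only keeps each result in a short-lived cache. The class `MatchDetailsProxy`, the injected `fetch` function, and the 60-second lifetime are all hypothetical. How the details are actually fetched upstream is left to `fetch`:

```python
import time


class MatchDetailsProxy(object):
    """Fetch match details on demand, caching each result for a short time."""

    def __init__(self, fetch, ttl_seconds=60, clock=time.time):
        self._fetch = fetch      # callable: match_id -> details
        self._ttl = ttl_seconds  # how long a cached result stays valid
        self._clock = clock      # injectable so the expiry can be tested
        self._cache = {}         # match_id -> (expires_at, details)

    def get(self, match_id):
        now = self._clock()
        hit = self._cache.get(match_id)
        if hit is not None and hit[0] > now:
            return hit[1]
        # Cache miss or expired entry: ask upstream and remember the answer.
        details = self._fetch(match_id)
        self._cache[match_id] = (now + self._ttl, details)
        return details
```

A handler for matchdetails.json could call `proxy.get(matchid)` and serialize the result. The dict here lives in one instance only; on App Engine the memcache service could play the same role across instances.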

Would it be useful to use some type of framework to help with datastore/cache?

Something like this:

https://github.com/ocanbascil/PerformanceEngine/

Check issue #2. I added some sample code.