Open ConnorFoody opened 9 years ago
Sam did the client-side throttle; someone still needs to do the server-side throttle (this may be more difficult).
Someone will also need to add a "get target" to the schedulable.
Could we do a multi-pass greedy algorithm?
1st pass: needed density = approx. number of articles from RSS per unit time (computed from the number of articles and the time range of the articles)
2nd pass: spread everything out evenly, i.e. {1, 2, 3, 4} with time frame 10 --> {1, 4, 7, 10}
3rd pass: clump and randomize according to some "humanness" stat, e.g. {1, 4, 7, 10} --> {1, 3, 4, 10}
Not sure how good a result we could get here, or how we would "fix" a bad stat
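A rough sketch of the three passes, just to make the idea concrete. Everything here is an assumption for illustration: the function names, the `humanness` parameter (treated as a fraction of the even spacing that a request may shift by), and the choice to pin the first and last requests in place. It is not how the scheduler currently works.

```python
import random


def needed_density(times, window):
    """Pass 1: approximate requests per unit time over the window."""
    return len(times) / window


def standardize(times, window):
    """Pass 2: spread the same number of requests evenly over [1, window]."""
    n = len(times)
    if n == 0:
        return []
    if n == 1:
        return [1]
    step = (window - 1) / (n - 1)
    return [round(1 + i * step) for i in range(n)]


def humanize(times, window, humanness=0.5, rng=None):
    """Pass 3: jitter interior requests so the spacing looks less regular.

    `humanness` is a hypothetical knob: 0 keeps the even spacing, larger
    values let requests drift further and clump. Two requests can land on
    the same tick; a real version would have to decide what to do then.
    """
    rng = rng or random.Random()
    if len(times) < 3:
        return list(times)
    max_shift = humanness * (window - 1) / (len(times) - 1)
    jittered = [times[0]]  # keep the first request where it is
    for t in times[1:-1]:
        shifted = round(t + rng.uniform(-max_shift, max_shift))
        jittered.append(min(max(shifted, 1), window))  # clamp to the window
    jittered.append(times[-1])  # keep the last request where it is
    return sorted(jittered)


even = standardize([1, 2, 3, 4], 10)  # [1, 4, 7, 10]
print(needed_density([1, 2, 3, 4], 10), even, humanize(even, 10))
```

The third pass is random, so its output changes from run to run; this does not answer how to score or "fix" a bad result.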
Issue also mentioned in scraper here.
We should make request timing appear more human and make sure we don't ping a site too often. We can do both by having the scheduler modify the schedules and treat the provided times as suggested priorities. Potential example of schedule change:
This should be separated out from the actual scheduler.
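One way this could live outside the scheduler is as a standalone function that takes the requested times and returns adjusted ones. A minimal sketch, assuming requests are `(time, site)` tuples and that the throttle is a single per-site minimum gap; `enforce_min_gap`, `min_gap`, and the site names are made up for illustration:

```python
def enforce_min_gap(requested, min_gap):
    """Treat requested times as suggestions: never run a request earlier
    than asked, but push it later if the same site was hit too recently."""
    last_run = {}  # site -> time of the most recent adjusted request
    adjusted = []
    for t, site in sorted(requested):
        earliest_allowed = last_run.get(site, float("-inf")) + min_gap
        actual = max(t, earliest_allowed)
        last_run[site] = actual
        adjusted.append((actual, site))
    return sorted(adjusted)


# Three requests to site "a" bunched together get spread out;
# site "b" is unaffected.
print(enforce_min_gap([(1, "a"), (2, "a"), (2, "b"), (3, "a")], min_gap=5))
# [(1, 'a'), (2, 'b'), (6, 'a'), (11, 'a')]
```

The scheduler would then only consume the adjusted list, so the throttling and "humanness" logic can change without touching it.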