Closed binarykitchen closed 2 years ago
That might be part of the cause of #486 yesterday. Can you go easy on it? We rely a lot on free hosting at the university.
sure, don't want to cause any trouble.
and by unusually high traffic i meant lots of requests from a single IP address, above average: something like 10 GET requests per minute for an hour. any good server should be able to cope with that little stress i think. if not, then this will happen sooner or later anyway. and honestly i think #486 might have a different cause, because at that time my automated tool wasn't running yet.
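For reference, a request rate like the one described above (about 10 GET requests per minute) could be capped on the client side with a small throttle. This is only a minimal sketch: the `Throttle` class and its parameters are hypothetical and are not taken from the tool discussed in this thread.

```python
import time


class Throttle:
    """Spaces calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute  # minimum gap between calls, in seconds
        self.clock = clock                 # injectable so the class can be tested
        self.sleep = sleep
        self.next_allowed = None           # earliest time the next call may run

    def wait(self):
        """Block until the next request is allowed, then reserve the slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

A crawler would call `throttle.wait()` before each request, so that `Throttle(10)` leaves at least 6 seconds between requests.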
I agree we could/should document the API publicly.
I have discussed this with Dave Moskovitz who is responsible for the API. While there is some existing API documentation, this needs to be tidied up before it's released publicly. We also need to ensure that the API would not be at risk of overloading from lots of simultaneous requests / denial-of-service attacks. We are also testing to make sure that the API only gives access to data that we're happy to share publicly. We will let you know once that is all in place.
We could serve the API from the Rails app instead. It would be a subset of the freelex functionality, but it's very easy to spin up a modern JSON API on some of the data, and I can use some automated tools to document it so we're not manually maintaining the documentation.
@Br3nda what are the rate-limiting features on Heroku? I'd really like to have something in place to prevent denial of service attacks before the API is shared more publicly. But serving the API on the rails app would be good.
There is no rate limiting in place currently @DSRU. Sadly, Heroku doesn't do any magic for us in this respect either. I'm not sure there is any action we can take here except to take on board that an API is a requested feature. Given that this issue isn't actionable as is, I would like to close it. What say you, @Br3nda @DSRU?
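To make the rate-limiting concern concrete, here is a minimal sketch of a per-client token bucket, one common way to cap request rates. It is illustrative only: the `TokenBucketLimiter` name, its parameters, and the use of Python are assumptions, and in the Rails app this logic would more likely live in middleware than in hand-written code.

```python
import time


class TokenBucketLimiter:
    """Per-client token bucket: allows short bursts, caps the sustained rate."""

    def __init__(self, rate_per_minute, burst, clock=time.monotonic):
        self.rate = rate_per_minute / 60.0  # tokens refilled per second
        self.burst = float(burst)           # bucket capacity
        self.clock = clock                  # injectable for testing
        self.buckets = {}                   # client_id -> (tokens, last_seen)

    def allow(self, client_id):
        """Return True if this client may make a request right now."""
        now = self.clock()
        tokens, last = self.buckets.get(client_id, (self.burst, now))
        # Refill in proportion to the time elapsed, up to the capacity.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[client_id] = (tokens - 1.0, now)
            return True
        self.buckets[client_id] = (tokens, now)
        return False
```

A request handler would call `limiter.allow(ip_address)` and reject the request (for example with HTTP 429) when it returns `False`. This in-memory version only works within a single process; several server processes would need a shared store.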
@eoinkelly can we put this in the icebox rather than closing altogether?
Is there an API? I'm interested in using this in a project I'm working on, and using an API would definitely be better than parsing HTML. Thanks.
@Oj18 see discussion above. We have logged the request for an API but at the moment have no resources available to pursue this.
Hey all, I see this feature has been iceboxed, the main reason being concerns around rate limiting and DDoS protection? Just wondering if you ever considered setting up an AWS API Gateway. That would allow you to rate limit, and I believe you can also generate some OpenAPI documentation from it, though I've never played with it myself. The API gateway could just serve as a proxy to your real API.
@cryosis7 the dictionary is actually just a thin skin over another backend that is used by the DSRU to manage the dictionary content. We're in the process of switching from that service, which is at least 15 years old, to a new(er) editorial tool based on https://github.com/Signbank/FinSL-signbank. That service has some support for getting data out, but it's fairly limited and mostly oriented to accessing sign data in annotation software for dictionary updates.
I know that it's not an API, but did you know that dictionary data is available as releases on https://github.com/odnzsl/nzsl-dictionary-scripts? This is the same data that the iOS and Android apps use, and it is updated by GitHub Actions quarterly (in theory; I'm still trying to tweak the cron statement in the GitHub workflow to have this work on the correct schedule). It doesn't include the videos, but the database that is exported does include public URLs to the videos, which you could chuck in a txt file for curl to download.
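As a rough illustration of the "chuck the URLs in a txt file" idea, the sketch below writes every distinct video URL from the exported database to a text file, one per line. It assumes the export is a SQLite file; the table name `words` and column name `video` are hypothetical placeholders, so check the actual schema of the release you download before using them.

```python
import sqlite3


def export_video_urls(db_path, out_path, table="words", column="video"):
    """Write distinct, non-empty values of `column` to out_path, one per line.

    The default table and column names are assumptions, not a documented schema.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            f'SELECT DISTINCT "{column}" FROM "{table}" '
            f'WHERE "{column}" IS NOT NULL AND "{column}" != \'\' '
            f"ORDER BY 1"
        ).fetchall()
    finally:
        conn.close()
    with open(out_path, "w", encoding="utf-8") as fh:
        for (url,) in rows:
            fh.write(url + "\n")
    return len(rows)  # number of URLs written
```

The resulting file could then be fed to curl, for example with `xargs -n 1 curl -O < video_urls.txt`, ideally with a pause between downloads to keep the load on the hosting low.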
Anyway, for now, I'm going to close this issue, because I think it's disingenuous to leave it open, and it's way more likely that if we add API access, it'd be to the Signbank project rather than this app. For now, I'm hoping that providing exported data is sufficient for most needs given the dictionary content updates are mostly incremental. We'll be updating these releases soon as well, since they currently pull from the old editorial service that is being shut down, and we'll likely be exporting even more data at this point since the plan is to update this application to use a local database that uses the exported data, just like the native apps do. If there's any information missing from that exported data that would be useful, feel free to open up an issue either on the signbank or scripts repo with your request and we'll see if we can incorporate it into the new exports.
maybe you have noticed an unusually high amount of traffic right now? this is because i'm writing a tool to automatically grab videos by crawling search results over html ... which is not ideal for either of us.
so yeah, wondering if you have an API and whether it's documented? for example a GET query that returns the link to a video for any given search term.
thanks :)