PokeAPI / pokeapi

The Pokémon API
https://pokeapi.co
BSD 3-Clause "New" or "Revised" License
4.13k stars 931 forks

Handing over PokéAPI #350

Closed phalt closed 5 years ago

phalt commented 6 years ago

Hey folks. I have decided to stop supporting PokéAPI myself.

This community has built up enough that it can support itself now and does not need my intervention. However, when production issues occur it does affect me, and I shouldn't be a blocker when it goes down. It is frustrating for people using the project for their school work or interviews, and it is frustrating for me receiving angry tweets and emails when I am on holiday and unable to respond.

Therefore I suggest you make a few decisions:

1) Decide between yourselves how to move the project away from the current production server. Host it however you choose, wherever you want.

2) Nominate someone, or a few people, to take the https://pokeapi.co domain off my hands - it is paid for another ~10 months but I will not pay for it after that.

This should keep the domain "alive" and the service running.

tdmalone commented 6 years ago

ping @Naramsim @jrubinator @lmerotta (and feel free to ping others you might be aware of who could be interested)

I'm happy to be part of a group that takes this on. I can't do it on my own, but I'm happy to assist.

@Naramsim I think you have mentioned before some details about the current server size etc. used to host PokeAPI - I recall it was around $80pm which looks like it might be the 16 GB droplet? (https://www.digitalocean.com/products/linux-distribution/ubuntu/)

phalt commented 6 years ago

Honestly you can probably get away with hosting it for free, somewhere. The problem is currently the service is built for a 2013 stack - postgres, django, nginx. You don't need all that in 2018. It is just JSON files you want to serve. I think you could find a cheaper and easier-to-maintain way of doing this.

tdmalone commented 6 years ago

That's true... Cloudflare + S3 would probably be close to free. @Naramsim You've generated plain JSON files from this before - is that something you could help spin up in a Docker container, which we could just run (maybe even on Travis) when we need to update? I'd be happy to supply the S3 bucket and work on the CI/CD for it.

lmerotta-zz commented 6 years ago

I'm in to help @tdmalone maintain the project! Migrating to a more modern architecture will be time consuming. I can offer to cover the server cost in the meantime, and also take ownership of the domain name.

Naramsim commented 6 years ago

Hi guys,

I can definitely help. Soon I will graduate (October 10) and in 7 days I will start a new job, so I will not have time to 'code' till October. I can put 5€ per month or two AWS education packs.

This is what an AWS education pack is:

Student Developer Pack members receive $15 in bonus AWS credits for a total of $50-$115

You can find more details here: https://education.github.com/pack/offers

@sargunv's project for creating static JSON files is https://github.com/pokesource/ditto, although I think its solution for serving these static files should be changed. It uses Flask, but for serving static files I would suggest a Python 3 project such as Sanic or Vibora. The best thing could be to just use Nginx with some rewrite rules (or serve everything using AWS Lambda and store everything using S3).

Right now I own a server on Vultr paying 2.99€ per month, I host a simple Python script (through Docker) which is not CPU/memory intensive. If you don't want to go with AWS we can either stay with DO or Vultr which is cheaper.

I also suggest bringing @cmmartti into this conversation, who has done an amazing job implementing GraphQL. I see he is willing to get into this project.

I would also suggest taking a deeper look if Cloudflare is really needed or not. From what I saw it caused some problems, although I know that it provides several advantages (HTTPS and zone blocking).

phalt commented 6 years ago

The static server idea is good, and will probably be super cheap if you cache the crap out of it too. Rewriting with nginx will also be incredibly cheap. Basically - just avoid paying for DB costs and you'll be able to use a bunch of static-file-server-type solutions

AlbertoOS commented 6 years ago

Hey guys, I'm also up to help maintaining the PokéAPI. I never took the time to look into the code but I know my way around some python so I think I can help with that and I can help with the server costs a bit too.

cmmartti commented 6 years ago

First, I am also up to being a maintainer. I know the current code back to front.

Like Naramsim said, I have been working on implementing the API in GraphQL, and I was actually about to submit a work in progress PR. I have it working and am using it locally, but I wanted to spin up a demo production server so people can test (that's what I was trying to do now). Due to the nature of GraphQL (custom field selection and pagination) and this API in particular (complex sorts and filters), it's not something that can be served with static JSON files.

Now the way I see it, there are a few options:

Now, given that I've made the decision to keep the GraphQL API idea alive, is it cheaper to run these as two separate setups, or together? The GraphQL API needs a database anyway, but it would probably still be cheaper to serve static files for v2 REST, and run a separate server with a DB for v3 GraphQL.

If you all think that this GraphQL thing is a good idea and belongs in the project (even if it's split from v2 and moved to its own repo) it would be something else we would need to maintain funding for. I can contribute a small amount, but at the moment I don't have much income. In any case, it's not ready for production yet, and won't be for a little while.

M-Zuber commented 6 years ago

In theory there is no reason that a GraphQL implementation can't use static files as its data store

phalt commented 6 years ago

GraphQL is big. If you don't mind supporting it, go for it. But the REST API can and should be rewritten to be much much simpler.

BevilaquaBruno commented 6 years ago

I like it, and I think the best idea is to rewrite the REST API to serve static JSON files. If you choose this option I can help. 😄 Something like the solution in https://github.com/pokesource/ditto .

neverendingqs commented 6 years ago

If it's just static files, Netlify might be a good fit. It's free if there is only one maintainer. They may offer it free for more maintainers since this is an open source project, but will have to ask. One maintainer might be okay, since it auto-deploys on merge to master.

Want me to ask Netlify about open source hosting / pricing?

neverendingqs commented 6 years ago

An option for a (temporary) solution:

That would give us some time to slowly migrate over to a pure static files implementation.

Exergist commented 6 years ago

Love PokeAPI (especially in conjunction with PoroCYon's PokeApi.NET). Currently the main PokeAPI portal shows that it "will shut down soon." What is the tentative timeline for PokeAPI going offline?

In the meantime, would this be a good candidate for a GoFundMe or similar effort to keep the server up while some kind of transition takes place?

kevindesousa commented 6 years ago

Hello,

I love the project. I have deployed the project on my infrastructure, if anyone would like to use it: pokeinfo.org. I am installing SSL :)

On the other side, I am writing a new API with Laravel 5.6 LTS and MySQL, with a dashboard (Vue.js) to manage data, if anyone would like to help me :)

tdmalone commented 6 years ago

@neverendingqs Yes please ask Netlify! I was thinking of doing the same with Cloudflare + S3 but Netlify will be much easier to manage because it does almost everything for you.

@Exergist The max timeline is 10 months (when the domain name expires) but it might be sooner if the Digital Ocean support expires. Re GoFundMe, the main problem is that Paul doesn’t have the time to maintain anymore. The funds needed are not a lot - especially if we convert to static hosting.

@kevindesousa That looks good! I note assets aren’t loading yet and the domain listed on the page is pokeapi.co, though. Where are you hosting? Can you manage the cost associated with all the hits if pokeapi.co was directed to you?

tdmalone commented 6 years ago

@cmmartti Re GraphQL, I think we should keep it separate from this discussion for now - and think about combining it when it’s ready for production. It sounds great, but also has the potential to make the keeping-PokeAPI-running-as-is discussion more complex. What do you think?

tdmalone commented 6 years ago

Summary so far

People willing to help maintain:

Have I covered everyone/everything?

Potential steps from here

I think that’s a decent group of people with a good spread of skills to get us going.

If no-one disagrees, I’d like to suggest we:

Thoughts? Agree/disagree?

kevindesousa commented 6 years ago

@tdmalone My servers are hosted by OVH cloud with a Docker stack, so I can upgrade easily. The cost is already managed by me and I have time to help :) We can test the load to find out, and add a cache if needed. (I don't know the actual load of pokeapi.co)

If you need help, my skills : devops-php-laravel-vuejs

neverendingqs commented 6 years ago

Emailed:

Hi Netlify,

My name is Mark, and I represent a popular open source project called PokéAPI (https://github.com/PokeAPI/pokeapi), an open RESTful API that has served over 400 million requests since inception. The goal of the project is to provide an API to support learners for educational purposes.

We are in the midst of looking for a new hosting platform, and I brought up Netlify, as it is a platform I already know and trust (and deploy my own sites on). I took a look at the pricing page (https://www.netlify.com/pricing/) and saw a reference about Vue.js and a free open source team plan (https://twitter.com/youyuxi/status/952220999877058561). Does Netlify still offer those, and if so, could we get more details about it, either via email or on https://github.com/PokeAPI/pokeapi/issues/350?

Thanks, Mark

cmmartti commented 6 years ago

@tdmalone Re GraphQL: I fully agree with that. I brought it up just in case it would affect anything.

As far as the proposed plan goes, I agree with that too. I just want to mention that I think it's important that any re-write of the REST API should be backwards compatible with the v2 API (we can probably kill v1 at this point--do we have numbers for how much it's used?).

neverendingqs commented 6 years ago

Update: https://github.com/pokesource/ditto won't work out of the box because Flask is taking the static files and re-mapping the domain name at GET time (and it looks like it only supports HTTP).

The bigger problem though is that the project has no license, so we can't just use / adapt the code unless the author gives us permission. The last commit was in 2016, so there might be a chance the author is still interested and is willing to change to a friendly license (e.g. MIT).

sargunv commented 6 years ago

@neverendingqs Hi, I wrote pokesource/ditto and I've been following this discussion from my email. I'd be happy to add a license. I typically use Apache 2.0 on my projects, would that be suitable?

sargunv commented 6 years ago

I've gone ahead and added the Apache 2.0 license to ditto.

neverendingqs commented 6 years ago

W00t! Apache 2.0 would be fine. Just need to add attribution to it in this project, but that shouldn't be a big deal.

Since you're here @sargunv - I think the only change we would need to make is to update the clone script to use a domain name different from the source (and allow HTTPS instead of hardcoding HTTP). Did I miss anything?

At this project's side of things, I think all we need to do is to put the JSON in the right place, set the right headers for that directory (Content-Type to be application/json), and de-template all the files in https://github.com/PokeAPI/pokeapi/tree/master/templates to pure HTML.

sargunv commented 6 years ago

@neverendingqs

I recommend putting ditto behind an Nginx reverse proxy (I've used linuxserver/letsencrypt to set that up before with HTTPS). If I recall correctly, passing the right headers through the reverse proxy will change the domain properly. Here's a config I used to use for that, but I don't have much memory of it: https://github.com/pokesource/substitute/blob/master/nginx.conf.

sargunv commented 6 years ago

If I have time this weekend, I'll give resurrecting ditto a shot. I've already got an Nginx install with HTTPS for my personal site, so I'll try putting ditto behind that and see how it goes.

neverendingqs commented 6 years ago

@sargunv - hmmm that's not the static file solution I was thinking of. I will have to defer to others in terms of how much we save if we switch to an nginx solution. We still have to generate the JSON files in CI, and once we have those files, it seems like a smaller step to simply replace all instances of http://localhost with https://pokeapi.co before serving them vs. switching to another backend.

5punk commented 6 years ago

Can someone share the bandwidth requirement details here? @phalt or anyone with this sort of information.

I've a VPS I'm willing to host this service on, just want to make sure I can meet the min reqs.

sargunv commented 6 years ago

@neverendingqs You mean just generate the json with ditto and serve with another system? In that case, a simple sed on content generated with ditto clone should be sufficient.
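
The generate-then-replace step discussed above could look roughly like the following. This is a minimal sketch, not ditto's code: the function name `rewrite_base_url` is hypothetical, and the two base URLs are taken from the thread on the assumption that the generated files contain `http://localhost` with no port. If the generated URLs include a port, pass the full prefix as `old` instead.

```python
from pathlib import Path

# Assumed values: the thread mentions http://localhost as the generated base
# URL and https://pokeapi.co as the target. A port-less base is an assumption.
OLD_BASE = "http://localhost"
NEW_BASE = "https://pokeapi.co"


def rewrite_base_url(root: Path, old: str = OLD_BASE, new: str = NEW_BASE) -> int:
    """Replace `old` with `new` in every file under `root`; return files changed."""
    changed = 0
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        updated = text.replace(old, new)
        if updated != text:
            # Only rewrite files that actually contained the old base URL.
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling `rewrite_base_url(Path("output"))` on a directory of generated files (the directory name here is only an example) would do the same job as the `sed` pass, with the file count returned as a quick sanity check.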

tdmalone commented 6 years ago

I haven't used ditto yet but the above sounds perfect - generate and replace the URL. As for actually serving HTTPS, we don't have to worry about that - Netlify will do it out of the box, S3+Cloudfront will do it, and for anything else, there's Cloudflare. No-one need worry about HTTPS in 2018. 😛

@5punk I'm not sure of the bandwidth requirements, but the hosting issues thus far have been the server resources I believe. If we go static and cache the heck out of it, bandwidth should be minimal (or free).

Naramsim commented 6 years ago

Hi, I suggest we continue this discussion on our Slack server. I think we should create a separate channel from the existing ones, don't know if it should be public or only accessible to few of us.

Another alternative to Netlify could be using Now. We can also spin up a Node server there to handle URL rewrites.

elementh commented 6 years ago

Although I understand the need of moving the discussion to the Slack server, please consider updating this thread from time to time.

I'm personally interested in the outcome of the project and this discussion but I don't know if I'd have the time to keep up with the slack discussion.

Thank you! :)

Naramsim commented 6 years ago

Ok, in the meantime I updated the PR which brings the new data from Veekun. https://github.com/PokeAPI/pokeapi/pull/317

BevilaquaBruno commented 6 years ago

I am using Ditto right now and I think we can write some code in Python or PHP to change all occurrences of https://localhost to a domain name. We can do this if we use the static JSON from Ditto.

I have a gist (it's a copy of https://github.com/Darkseal/dir2json) to turn all the folders and all the content of the JSON files into a single JSON. Sure, this is useless for us now, I think, but it's an example.

tdmalone commented 6 years ago

@Naramsim I thought about Slack too, but in two minds about it. If we do it - gotta be public I think, this is open source after all. But all being in different time zones, Slack can be a challenge - I find GitHub issues work better as they’re asynchronous. Slack you’ve almost gotta be there at the time, or you miss out.

Having said that, I think we’ve found some great ideas here and we just need to drive it into some form of consensus for stepping forward.

cmmartti commented 6 years ago

@Naramsim I would personally prefer it if this discussion was kept here rather than on Slack, for the reasons @tdmalone mentioned.

neverendingqs commented 6 years ago

I don't mind continuing the static files conversation here, but I'm also happy having it as another issue.

For those interested in the static files option, I just want to clarify that the goal is to have no server-side code (e.g. python, php). Instead, we simply generate a file for every possible API endpoint (e.g. https://pokeapi.co/api/v2/pokemon/151 maps to a JSON file sitting on a server somewhere). This means we can do very little to transform the file on its way out.

The reasons for this approach are:

As for actually serving HTTPS, we don't have to worry about that - Netlify will do it out of the box, S3+Cloudfront will do it, and for anything else, there's Cloudflare.

S3 + Cloudfront + Certificate Manager is almost like HTTPS out-of-the-box ;)

Another alternative to Netlify could be using Now. We can also spin up a Node server there to handle URL rewrites.

Now's definitely a candidate, but they don't have a CDN for their OSS plan. Although I don't think we necessarily need one, especially if it's free (as long as we don't hit their bandwidth limit). Again though, if we already have the static files, it seems like a smaller step to just transform + host the static files vs. figuring out hosting for a back-end server.

neverendingqs commented 6 years ago

Using sed post-ditto to transform the JSON files would work, but then we'll be dealing with the file system twice per file (once to download it, and once to fix the domains). It should perform a lot faster if we do it before writing the file to disk, as the entire JSON blob is still in memory.
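
The idea of fixing the domain while the JSON blob is still in memory can be sketched as below. This is an illustration only, not the contents of the linked ditto PR: `rewrite_urls` and `save_resource` are hypothetical names, and the blob is assumed to be an already-parsed JSON value.

```python
import json
from pathlib import Path
from typing import Any


def rewrite_urls(value: Any, old: str, new: str) -> Any:
    """Return a copy of a parsed JSON value with `old` replaced by `new` in strings."""
    if isinstance(value, str):
        return value.replace(old, new)
    if isinstance(value, list):
        return [rewrite_urls(item, old, new) for item in value]
    if isinstance(value, dict):
        return {key: rewrite_urls(item, old, new) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged


def save_resource(blob: Any, dest: Path, old: str, new: str) -> None:
    """Rewrite the blob while it is still in memory, then write it to disk once."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(json.dumps(rewrite_urls(blob, old, new)), encoding="utf-8")
```

Compared with a post-hoc `sed`, each file is written exactly once, and only string values are touched, so keys and numeric fields cannot be altered by accident.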

I have not tested this, but this is an example of how we can do it inside ditto to avoid dealing with the file system twice: https://github.com/pokesource/ditto/pull/3

neverendingqs commented 6 years ago

Potential blocker: it takes a longggggg time to build the docker container. Is it safe to assume it takes that long to get a local build running? Both CircleCI and Travis CI aren't happy with long-running jobs...

CircleCI:

Note: Jobs have a maximum runtime of 5 hours. If your jobs are timing out, consider running some of them in parallel.

Travis CI:

When a job on a public repository takes longer than 50 minutes.

sargunv commented 6 years ago

Once upon a time there was a PR to PokeApi that optimized table creation using bulk inserts. I don't believe it ever got merged, but it claimed a 50% improvement in build time. We could take a look at that and update for master if possible.

neverendingqs commented 6 years ago

@tdmalone - would you be game for creating a GitHub project to split out the discussions here? It might be easier to track everything that way.

neverendingqs commented 6 years ago

Hmm one thing about using static files is that routes like https://pokeapi.co/api/v2/language/ will look different (e.g. it will have to look something like https://pokeapi.co/api/v2/language/index.json). I guess we can always go to a v3...?

sargunv commented 6 years ago

Most http servers can be configured to serve an index file as the default for a directory.



cmmartti commented 6 years ago

@sargunv By coincidence, I am already working on this, though I wasn't aware that there was a closed PR for it. I am using a similar approach to that PR, but using bulk_create in batches of 200 and also setting foreign keys directly (e.g. language_id = int(info[1])) rather than getting the object with that key and assigning it to the model (e.g. language = Language.objects.get(pk = int(info[1]))).

Setting foreign keys directly has the downside that it will fail if that foreign key doesn't already exist, meaning it won't be possible to build just a part of the DB like we can now (so you will only be able to call build_all()), but it's all around a much faster process now so that shouldn't really matter.

Initial tests indicate that it may be as much as a staggering two orders of magnitude faster than individually saving each object. The pokemon_v2_pokemonmove table has more than 500 000 records and it usually took hours to build that table alone, but using the new fast build, it takes about three or four minutes. I'm guessing that the whole DB build will take around 10–15 minutes, once I finish converting it all.
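
The batching approach described above can be illustrated without Django. The sketch below swaps in the standard-library `sqlite3` module and `executemany` in place of Django's `bulk_create`, purely so that it runs on its own. The table and column names (`language`, `language_name`) are hypothetical and not taken from the PokeAPI schema; only the batch size of 200 comes from the comment.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

BATCH_SIZE = 200  # batch size mentioned in the thread

Row = Tuple[int, int, str]  # (id, language_id, name); hypothetical columns


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def bulk_load(conn: sqlite3.Connection, rows: Iterable[Row]) -> int:
    """Insert rows in batches, storing language_id directly without a lookup."""
    total = 0
    for chunk in batches(rows, BATCH_SIZE):
        conn.executemany(
            "INSERT INTO language_name (id, language_id, name) VALUES (?, ?, ?)",
            chunk,
        )
        total += len(chunk)
    conn.commit()
    return total


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce foreign keys in SQLite
    conn.execute("CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE language_name ("
        "id INTEGER PRIMARY KEY, "
        "language_id INTEGER NOT NULL REFERENCES language(id), "
        "name TEXT)"
    )
    return conn
```

Because the foreign key is written as a raw integer, inserting a row whose `language_id` has no matching `language` row fails with `sqlite3.IntegrityError` when foreign keys are enforced, which mirrors the ordering constraint mentioned above: parent tables have to be built first.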

I'll put together a work in progress PR.

cmmartti commented 6 years ago

See #351.

tdmalone commented 6 years ago

@neverendingqs As @sargunv pointed out, I don’t think the .json extension will be a problem. We can either use an index document (easy with S3, I’m sure Netlify makes it easy too), or we can write the files directly with no extension and ensure we set the mime type correctly (S3 all good, again I’m sure Netlify would have a way).
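
The index-document option amounts to a fixed mapping from request path to file path. A minimal sketch of that mapping follows, assuming every endpoint is stored as a directory containing an `index.json` file. The function name is hypothetical, and in practice the host's own configuration (for example an S3 index document or the nginx `index` directive) would do this rather than application code.

```python
from pathlib import PurePosixPath

INDEX_NAME = "index.json"  # assumed index document name, as suggested in the thread


def resolve_static_path(url_path: str) -> str:
    """Map an API request path to the relative file path that would serve it."""
    parts = [p for p in url_path.split("/") if p]
    if any(p in (".", "..") for p in parts):
        raise ValueError("path traversal is not allowed")
    # Trailing slashes are ignored, so /language and /language/ resolve alike.
    return str(PurePosixPath(*parts, INDEX_NAME))
```

With this layout, `/api/v2/language/` and `/api/v2/pokemon/151` keep their existing public URLs while the files on disk still carry a `.json` extension.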

@phalt Do you think you could add the people who have offered to help to the PokeAPI GitHub organisation, and give a few of us admin, so we can get started with using things like Projects? EDIT: List of people in this comment.

sargunv commented 6 years ago

I can also relocate the ditto repo to the org if we're going to be using it for this effort

sargunv commented 6 years ago

I've relocated ditto to PokeAPI/ditto but I'll need @phalt to give me write access to the repo

phalt commented 6 years ago

I've relocated ditto to PokeAPI/ditto but I'll need @phalt to give me write access to the repo

Done. Collaborator team also has write access