fabd / kanji-koohii

A web application to help Japanese language learners remember the kanji.
https://kanji.koohii.com
GNU Affero General Public License v3.0

Move site to a new host -- fix performance issues #166

Open fabd opened 5 years ago

fabd commented 5 years ago

This is a discussion topic.

Any advice on moving to a VPS hosting solution is welcome.

Things we need

Battle Plan

Setting Up

Pre-Move

Moving Day

Post Move

fabd commented 5 years ago

Then I have to figure out how to smoothly transition.

I still have to think about any gotchas with SEO and Google.

shawm11 commented 5 years ago

Looking at your Docker files, it seems you are using PHP 7.0 and MySQL 5.6. Although you are using MySQL, you should be able to use MariaDB like you would MySQL. Some hosts are using MariaDB instead of MySQL.

A MySQL dump (mysqldump) is the safest option for migrating the database. I don't think the size of the .sql file would be a problem. But if you need help migrating, many hosts will help you migrate your database.
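As a rough illustration of the dump step, here is a minimal sketch that only builds and prints a mysqldump command line. The database and user names are placeholders, and the chosen flags are an assumption about what suits this database: --single-transaction only gives a consistent snapshot for InnoDB tables, so MyISAM tables would need a different approach.

```python
import shlex


def build_dump_command(database, user, host="localhost"):
    """Return the mysqldump argument list for a dump of one database."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--password",            # no value given, so mysqldump prompts for it
        "--single-transaction",  # consistent snapshot for InnoDB without locking tables
        "--quick",               # stream rows instead of buffering whole tables
        "--routines",            # include stored procedures and functions
        database,
    ]


if __name__ == "__main__":
    # "koohii_db" and "koohii_user" are placeholder names, not the real ones.
    cmd = build_dump_command("koohii_db", "koohii_user")
    print(shlex.join(cmd) + " | gzip > dump.sql.gz")
```

The printed line can be pasted into a shell on the old host; the compressed file is then copied to the new server and loaded with the mysql client.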

About the specs you need, does your current host have somewhere you can get the usage statistics about your resources (CPU, RAM, storage, etc.)? What are the specs of your current host?

38911BytesFree commented 5 years ago

Have you considered AWS? Their free tier (free for one year) gives you a server + database that could probably host the website. Not managed but the price is right :) For a managed service they offer Lightsail that might be worth looking at.

fabd commented 5 years ago

@38911BytesFree Unmanaged is definitely not an option. It's going to be a lot of headaches for me.

Besides, Koohii doesn't really need a specific configuration. The default shared hosting PHP settings are fine. HostGator's shared hosting used to run great, and the site ran pretty well some years ago. Performance has just been dwindling since then, especially since last year. So my hunch is aging hardware and/or new guests on this server that use a lot of resources.

Either way, it's been downhill since HostGator removed the ability to directly file a support ticket with the knowledgeable staff. For example, right now I'm unable to use less in the remote shell! And I couldn't be bothered to spend half an hour with their "live chat" to get that fixed.

I'm excited to finally move to a better host :) But I need something relatively easy to maintain.

fabd commented 5 years ago

I see on AWS Lightsail I could use the LAMP stack. It's not clear whether I would need a separate database instance, though it seems like a good idea?

With Lightsail managed databases, you can easily scale your databases independently of your virtual servers,

fabd commented 5 years ago

@shawm11

> About the specs you need, does your current host have somewhere you can get the usage statistics about your resources (CPU, RAM, storage, etc.)? What are the specs of your current host?

They're not bad I think. I managed to get the following a while ago, by using /proc/cpuinfo, free -m, cat /proc/scsi/scsi (I'm not sure if this is 100% correct though). The hardware specs are not otherwise displayed in cPanel, afaik:

* AMD Opteron(tm) Processor 6376 (server processor)
  2 cpus x  16 cores
* 32 GB RAM
* "MEGARAID SAS 9286CV-8E" + 
  2x 240 GB ... OCZ Deneva 2 C M21 MLC Sync 240GB, SATA (D2CSTK251M21-0240)

This is a shared server.

MySQL info according to phpMyAdmin:

Server: Localhost via UNIX socket
Server type: Percona Server

mysql> show variables;

    innodb_buffer_pool_size  128 MB
    key_buffer_size            8 MB
    max_heap_table_size      256 MB    ... important for temp tables in memory
    tmp_table_size           256 MB

This doesn't look bad... does this suggest the $15 Standard plan in AWS Lightsail would be limiting?

I wonder if it would be better to splurge for the $20 server with 4 GB Memory - 2 Core Processor rather than a lower tier + the separate managed database. As far as I understand, the AWS Lightsail server + LAMP stack lets you create databases on it. But then the CPU is running threads for both PHP and MySQL.

fabd commented 5 years ago

There is a big downside to those "hands free" servers from what I understand... if it crashes for whatever reason there is nobody but me to get it back up. If I'm away on the weekends the server could just be down for a couple days or more.

Or is there some kind of "auto server reboot" mechanism?

shawm11 commented 5 years ago

> They're not bad I think. I managed to get the following a while ago, by using /proc/cpuinfo, free -m, cat /proc/scsi/scsi (I'm not sure if this is 100% correct though). The hardware specs are not otherwise displayed in cPanel, afaik:
>
> * AMD Opteron(tm) Processor 6376 (server processor)
>   2 cpus x  16 cores
> * 32 GB RAM
> * "MEGARAID SAS 9286CV-8E" +
>   2x 240 GB ... OCZ Deneva 2 C M21 MLC Sync 240GB, SATA (D2CSTK251M21-0240)
>
> This is a shared server.

Yeah, this looks like the specs of the machine your site is on. It's anyone's guess how many sites are on that machine and how much of the resources go to each site. My guess is that there are at least 32 sites on this machine because of the processor (1 thread per site).

> MySQL info according to phpMyAdmin:
>
> Server: Localhost via UNIX socket
> Server type: Percona Server
>
> mysql> show variables;
>
>     innodb_buffer_pool_size  128 MB
>     key_buffer_size            8 MB
>     max_heap_table_size      256 MB    ... important for temp tables in memory
>     tmp_table_size           256 MB
>
> This doesn't look bad... does this suggest the $15 Standard plan in AWS Lightsail would be limiting?

Although the innodb_buffer_pool_size seems small for a busy database that is >1 GB. I think the $15 Standard plan would actually be overkill. I don't think you need a separate database instance right now.

> I wonder if it would be better to splurge for the $20 server with 4 GB Memory - 2 Core Processor rather than a lower tier + the separate managed database. As far as I understand, the AWS Lightsail server + LAMP stack lets you create databases on it. But then the CPU is running threads for both PHP and MySQL.

The $10 server with 2 GB Memory and 1 Core Processor would definitely be an upgrade to what you have now. You can upgrade to the $20 server later.

Many modern CPUs expose 2 logical cores per physical core, so each physical core can run at least 2 threads. In that case, you don't need to worry about PHP and MySQL running at the same time. But I don't know if a "core" in AWS Lightsail is a physical core or a logical core.
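One way to check the logical versus physical question on a Linux box is to parse /proc/cpuinfo, which was already used above to get the hardware specs. The sketch below is only an approximation: it assumes the "physical id" and "core id" fields are present, which is not guaranteed inside a virtual machine, where the hypervisor decides what the guest sees.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) core counts parsed from /proc/cpuinfo text."""
    logical = 0
    physical = set()
    physical_id = core_id = None
    # The extra empty string makes sure the last processor entry is closed.
    for line in cpuinfo_text.splitlines() + [""]:
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
        elif not line.strip():
            # A blank line ends one processor entry.
            if physical_id is not None and core_id is not None:
                physical.add((physical_id, core_id))
            physical_id = core_id = None
    return logical, len(physical)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            logical, physical = count_cores(handle.read())
        print(f"logical cores: {logical}, physical cores: {physical}")
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

If the physical count comes back as 0, the fields were missing and the result says nothing about the underlying hardware.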

38911BytesFree commented 5 years ago

Start with a cheap server and see what happens - you never know with performance, the current db setup looks very conservative so it might run fine (especially if you add some swap space). Upgrading to a larger server will only take a few clicks if you need it.

For auto-restart, a script can be set up with cron to run every few minutes, check that the web server and db processes are running, and restart them if not.

fabd commented 5 years ago

Thank you guys.

> Upgrading to a larger server will only take a few clicks if you need it.

Indeed with Lightsail it looks like you can take some kind of snapshot and then restore it on a higher plan. So that's one thing going for it, but I guess most of these services have some kind of upgrade path.

What about other services like Linode or DigitalOcean, any thoughts?

fabd commented 5 years ago

> Although the innodb_buffer_pool_size seems small for a busy database that is >1 GB.

@shawm11 Ok you think I should increase this setting on the new server?

fabd commented 5 years ago

I'm actually excited to move to a better server! This weekend I'm with family though, but maybe I can start playing with AWS Lightsail on Sunday/Monday.

shawm11 commented 5 years ago

> Although the innodb_buffer_pool_size seems small for a busy database that is >1 GB.
>
> @shawm11 Ok you think I should increase this setting on the new server?

No, you should start with the new server's defaults. Increasing innodb_buffer_pool_size too much may cause more harm than good in performance.

shawm11 commented 5 years ago

Fully managed VPS hosting can get pricey. What's your budget?

fabd commented 5 years ago

Yep that’s the issue :/

I’m just worried now because I found a bunch of threads with people having to restart their AWS Lightsail instances. Now is it that they didn’t configure it properly?

The current host has problems but at least it’s always available once the lag spikes are over. Worst case it’s been unavailable for an hour or two, but then again, it comes back by itself, since it’s managed.

Budget-wise, somewhere up to 50 EUR monthly seems reasonable. Thing is, with a better server, the JapanesePod affiliation could potentially make up for the increased cost (since in the long run I'm losing both affiliate and Patreon support if the site keeps running poorly).

Regarding managed: isn't that an advantage of the separate MySQL instance? Those are kept up to date from what I understand, so that reduces the point of failure to the PHP instance.

I looked for "semi-managed" but I don't know what those are, besides the managed MySQL instances.

38911BytesFree commented 5 years ago

Yes - according to AWS, a managed Lightsail db will perform common maintenance tasks for you, like patching the underlying database infrastructure and operating system, and upgrading databases between minor versions.

The default setup on Lightsail seems to be to leave a process down if it stops running and require manual intervention to restart it. The common workaround is to add a cron script that polls the processes and restarts them automatically if they are not running. I'm sure it could be made bulletproof, but it may require some work initially.

I only have experience with AWS, but your budget looks quite generous, so you may be able to find out-of-the-box solutions from other providers that require less setup.

fabd commented 5 years ago

> The default setup on Lightsail seems to be to leave a process down if it stops running and require manual intervention to restart it. The common workaround is to add a cron script that polls the processes and restarts them automatically if they are not running. I'm sure it could be made bulletproof, but it may require some work initially.

Ok I understand that part, but then wouldn't you need to run the cron job on another computer that is always on? Presumably I get access to something like a Docker container, a virtualized space, and I don't get any reach to the host outside of it that would be able to restart the instance. Or am I misunderstanding? @38911BytesFree

edit: hmm ok, or are we talking about processes failing now and then, while the instance itself rarely needs a restart? If processes need a restart now and then but the instance is still running the cron scripts, then it makes sense. Do you have any examples online of what such a script looks like?

fabd commented 5 years ago

Some very rough stats checking Awstats:

1,000,000+ "Pages" in August 2019 (Page = HTML / PHP request by a visitor), 9.79 GB bandwidth:

* so roughly 33,333 pages per day
* let's say 30,000 per day from signed-in users
* a typical dynamic PHP page has 10 SQL queries
* so up to 300,000 SQL queries per day
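The same back-of-the-envelope estimate can be written out as a short calculation. The inputs are the rough figures quoted above; the 30-day month and the average per second are added assumptions, and an average hides traffic peaks.

```python
# Rough figures from the Awstats estimate above.
pages_per_month = 1_000_000
days_in_month = 30               # assumption: a round 30-day month
signed_in_pages_per_day = 30_000
queries_per_page = 10

pages_per_day = pages_per_month / days_in_month
queries_per_day = signed_in_pages_per_day * queries_per_page
avg_queries_per_second = queries_per_day / 86_400  # seconds in a day

print(f"pages per day:   {pages_per_day:,.0f}")
print(f"queries per day: {queries_per_day:,}")
print(f"average queries per second: {avg_queries_per_second:.1f}")
```

Averaged over a full day this comes to about 3.5 queries per second, so the peak hours matter more than the daily total when sizing the server.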

fabd commented 5 years ago

DigitalOcean is just recently adding Managed MySQL.

fabd commented 5 years ago

Something I don't understand is how a Managed Database would connect, performance-wise. If it's not in the same datacenter as the PHP instance, wouldn't making requests over the network add a ton of latency? And how would I know that it's on the same local network?

38911BytesFree commented 5 years ago

Right - I'm talking about the common case (for me at least) where a bug in the code causes a process to crash. A cron job can be used to restart the process after a few mins to minimize downtime.

https://www.cyberciti.biz/faq/how-to-restart-a-process-out-of-crontab-on-a-linuxunix/

The article also mentions a configuration option in systemd to restart the process, and a tool called monit. If run from Docker, the --restart=always flag will restart the container when it fails, so there are lots of options.
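For a concrete picture of the cron approach, here is a minimal watchdog sketch. It assumes a systemd-based distribution, and the unit names apache2 and mysql are guesses that vary by distribution (for example httpd or mariadb). It is not taken from the linked article.

```python
import subprocess

# Assumed unit names; adjust them to whatever the server actually runs.
SERVICES = ["apache2", "mysql"]


def is_active(service, run=subprocess.run):
    """Return True if systemd reports the unit as active."""
    result = run(["systemctl", "is-active", "--quiet", service])
    return result.returncode == 0


def restart_stopped(services, run=subprocess.run):
    """Restart every inactive service and return the names that were restarted."""
    restarted = []
    for service in services:
        if not is_active(service, run):
            run(["systemctl", "restart", service])
            restarted.append(service)
    return restarted


if __name__ == "__main__":
    try:
        for name in restart_stopped(SERVICES):
            print(f"restarted {name}")
    except FileNotFoundError:
        print("systemctl not found; this sketch needs a systemd-based system")
```

Saved under a path such as /usr/local/bin/watchdog.py (a hypothetical location), it could be run from root's crontab every five minutes with a line like `*/5 * * * * /usr/bin/python3 /usr/local/bin/watchdog.py`. This only covers crashed processes; it does nothing if the whole instance is down.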

In the relatively rare case where the instance itself is in a bad state due to hardware failure, AWS EC2 has a feature called Auto Scaling which will detect when a host is down and start a new one, but I don't think this is available for Lightsail. You could run two instances behind a load balancer if it's a concern.

For the managed db, AWS divides the world into regions - as long as you create your host and db in the same region the network connection is fast enough that latency won't be an issue for most applications.
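Rather than guessing at the latency, it can be measured from the web server itself. The sketch below times how long it takes to open a TCP connection to the database endpoint, which approximates one network round trip. The host and port are passed on the command line because the real endpoint name is not known here, and a full SQL query will cost more than a bare connection.

```python
import socket
import statistics
import sys
import time


def tcp_connect_ms(host, port, samples=5, timeout=3.0):
    """Return the median time in milliseconds to open a TCP connection."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection opened; close it straight away
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 latency.py <db-host> <port>, e.g. port 3306 for MySQL.
    host, port = sys.argv[1], int(sys.argv[2])
    print(f"median connect time: {tcp_connect_ms(host, port):.2f} ms")
```

A page that runs 10 queries in sequence pays that round trip about 10 times, so comparing the result against localhost gives a feel for the added cost per page.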

fabd commented 5 years ago

Thanks!

Given these demographic stats, with the US and Japan being 37% of the audience ... what do you think would be a good region?

Currently the Koohii shared hosting server is located in Texas.

My hunch is East Coast or West Coast. I don't know if East Coast has better latency for Europeans. At least, when the HG server was running well, typical response times were 120-150 ish for me in Belgium, which seems good.

fabd commented 5 years ago

I'm going to experiment with the LAMP (PHP 7) stack on AWS right now.

fabd commented 5 years ago

Hmm ok, after some fiddling, even though it will take time, I feel like I should probably install a LAMP stack myself. Otherwise all my scripts are tied to AWS (Bitnami) specifically. Like I have to take note of all the paths they use, how they chose to link all the .conf files, what defaults they went for and whatnot. I'm not sure it's saving me much time, besides looking up how to install phpMyAdmin, OPcache, etc.

If I do it by myself on top of Ubuntu 18.04 , say, then I can create a script like a Dockerfile that could run on a DigitalOcean instance, or Vultr, or AWS Lightsail... to install packages and set up the configs. I'll try that next weekend.

iveskins commented 5 years ago

> a script like a Dockerfile ... to install packages and set up the configs. I'll try that next weekend.

What are your feelings about Ansible?

shawm11 commented 5 years ago

If AWS is not working for you, you could try GoDaddy's VPS hosting. Their "Managed" plan seems to be what you are looking for. I've only used GoDaddy's shared hosting for a small PHP+MySQL website for a client, so I don't know how good their VPS hosting is. My experience with GoDaddy's shared hosting has been pretty good. They seem to have servers in many countries.

GoDaddy VPS hosting (US): https://www.godaddy.com/hosting/vps-hosting
GoDaddy VPS hosting (Japan): https://jp.godaddy.com/hosting/vps-hosting
GoDaddy VPS hosting (Belgium): https://be.godaddy.com/hosting/vps-hosting

fabd commented 5 years ago

@shawm11 Thanks, the "free 1 year ssl" suggests that they don't support Let's Encrypt. But that's an option if I go managed VPS.

@iveskins Seems a little much for me. I just need a record of the setup I did, so a simple bash file with the steps I took to set up the environment would save me time if one VPS is unsatisfying and I need to set up on another. In fact a good Dockerfile for apache/php with the security included (like maybe one that people deploy to production?) will give me all the steps I need. I don't want to learn how to deploy a Docker container as such though, otherwise I'll never be done. I have a simple bash script to archive / upload / extract the site and it works well enough.

Maybe I was just overthinking the "manage" part. The initial setup will be time consuming.

For maintenance, it might not be very different than updating my Ubuntu development partition? sudo apt update && sudo apt upgrade. Are there other things I'll need to be wary about?

Some pitfalls I can think of:

shawm11 commented 5 years ago

> @shawm11 Thanks, the "free 1 year ssl" suggests that they don't support Let's Encrypt. But that's an option if I go managed VPS.

Yeah, they don't include Let's Encrypt so people buy their SSL certificates. However, with any Linux VPS host where you have sudo access through SSH, you can install the Let's Encrypt Certbot to automatically renew your Let's Encrypt 90-day certificate. In other words, you can set up automatic Let's Encrypt cert renewal in the initial setup if the VPS does not include it.

> @iveskins Seems a little much for me. I just need a record of the setup I did, so a simple bash file with the steps I took to set up the environment would save me time if one VPS is unsatisfying and I need to set up on another. In fact a good Dockerfile for apache/php with the security included (like maybe one that people deploy to production?) will give me all the steps I need. I don't want to learn how to deploy a Docker container as such though, otherwise I'll never be done. I have a simple bash script to archive / upload / extract the site and it works well enough.

I think a simple bash script is enough, since the website doesn't seem to have any atypical requirements that require a special environment. The only problem is that you may need to make small changes to the bash script based on the VPS host you use.

> For maintenance, it might not be very different than updating my Ubuntu development partition? sudo apt update && sudo apt upgrade. Are there other things I'll need to be wary about?

Are you talking about using a setup from a Dockerfile?

38911BytesFree commented 5 years ago

Getting things set up in Docker is very worthwhile: you are then completely independent of the VPS provider, and security updates on the underlying server should be as simple as apt update / upgrade.

There are a lot of tutorials on setting up docker with apache & php so hopefully you should be able to get things up and running fairly quickly. This one looked interesting: https://medium.com/@wemakewaves/migrating-our-php-applications-to-docker-without-sacrificing-performance-1a69d81dcafb

Also AWS WAF could be useful for spam / bot protection: https://docs.aws.amazon.com/solutions/latest/aws-waf-security-automations/capabilities.html

fabd commented 5 years ago

Thanks. I updated the repo some time ago to use docker (ie. php apache Dockerfile). It's much easier for any potential contributor to get the site running, and it sets up a sample database as well. However this Docker setup is for development. It doesn't need to deal with any security, and it also has tools like node that are only for building and not production.

I am not familiar with deploying a container as such. Isn't there a performance overhead from running production as a Docker container? I see this article shows the performance costs, and you have to really know what you are doing to get the container to run close to (but not the same as) "bare metal".

What I plan to do is find the steps from various Dockerfiles, like the ones mentioned in the article, to set up the VPS instance. I'll just turn it into a shell script should I need to set up another instance. I think figuring out the Docker deployment is a little much right now :) (I mean dealing with dev and production Docker configurations)... but it could be an option later.

Thanks for the AWS link. I wonder if DigitalOcean has something similar.

fabd commented 5 years ago

Latency using managed database: finally found some information in this blog post. Author mentions up to +150 ms for a Managed Database on DigitalOcean in the same region.

I guess I'll go with 1 LAMP server initially, with a 2 vCPU server. That's cheaper than the Managed Database setups.

If I have time this weekend I'll try to optimize the database first. The stories table is 1 GB and I haven't run an OPTIMIZE TABLE statement in a long while.

VaclavK commented 5 years ago

was going to comment myself re: Docker for easy lift and shift, and a virtual server rather than physical, but skimming over the history this has been on the table anyway

unless you are doing something crazy, running a container with a port open to the outside would not result in a measurable effect on performance, and for availability purposes why not run multiple instances behind an LB anyway :D