Laughing-Man-Studios / Ski-Resort-Dashboard

A small dashboard to display some statistics about the local ski resorts I visit
MIT License

Application Crashes In Production #69

Closed Rogibb111 closed 1 year ago

Rogibb111 commented 1 year ago

I am getting the following error on fly.io when trying to run the scraping functionality:

java.net.SocketTimeoutException: Read timeout
   at org.jsoup.internal.ConstrainableInputStream.read(ConstrainableInputStream.java:58)
   at java.io.FilterInputStream.read(FilterInputStream.java:107)
   at org.jsoup.internal.ConstrainableInputStream.readToByteBuffer(ConstrainableInputStream.java:8
   at org.jsoup.helper.DataUtil.readToByteBuffer(DataUtil.java:202)
   at org.jsoup.helper.DataUtil.parseInputStream(DataUtil.java:107)
   at org.jsoup.helper.HttpConnection$Response.parse(HttpConnection.java:835)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.doc$lzycompute$1(JsoupBrowser.scala:76)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.doc$1(JsoupBrowser.scala:76)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.processResponse(JsoupBrowser.scala:78)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.$anonfun$executePipeline$4(JsoupBrowser.s
   at scala.Function1.$anonfun$andThen$1(Function1.scala:85)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.get(JsoupBrowser.scala:35)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.get(JsoupBrowser.scala:29)
   at scrapers.BaseScraper.<init>(BaseScraper.scala:42)
   at scrapers.ScraperFactory$.initializeScraper(BaseScra
   at controllers.HomeController.$anonfun$generateFuture$1(HomeController.scala:67)
   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
   at scala.concurrent.impl.Promise$Transformation.run(Promise.sc
   at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
   at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
   at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
   at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
   at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
java.net.SocketTimeoutException: Read timeout
   at org.jsoup.internal.ConstrainableInputStream.read(ConstrainableInputStream.java:58)
   at java.io.FilterInputStream.read(FilterInputStream.java:107)
   at org.jsoup.internal.ConstrainableInputStream.readToByteBuffer(ConstrainableInputStream.java:8
   at org.jsoup.helper.DataUtil.readToByteBuffer(DataUtil.java:202)
   at org.jsoup.helper.DataUtil.parseInputStream(DataUtil.java:107)
   at org.jsoup.helper.HttpConnection$Response.parse(HttpConnection.java:835)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.doc$lzycompute$1(JsoupBrowser.scala:76)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.doc$1(JsoupBrowser.scala:76)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.processResponse(JsoupBrowser.scala:78)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.$anonfun$executePipeline$4(JsoupBrowser.s
   at scala.Function1.$anonfun$andThen$1(Function1.scala:85)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.get(JsoupBrowser.scala:35)
   at net.ruippeixotog.scalascraper.browser.JsoupBrowser.get(JsoupBrowser.scala:29)
   at scrapers.BaseScraper.<init>(BaseScraper.scala:42)
   at scrapers.PowdrScraper.<init>(PowdrScraper.scala:12)
   at scrapers.ScraperFactory$.initializeScraper(BaseScraper.scala:28)
   at controllers.HomeController.$anonfun$generateFuture$1(HomeController.scala:67)       
   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
   at scrapers.BaseScraper.<init>(BaseScraper.scala:42)
   at scrapers.WinterParkScraper.<init>(WinterParkScraper.scala:11)
   at scrapers.ScraperFactory$.initializeScraper(BaseScraper.scala:29)
   at controllers.HomeController.$anonfun$generateFuture$1(HomeController.scala:67)
   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
   at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:678)
   at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:467)
   at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
   at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
   at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
   at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
   at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

It looks like this is due to the application taking up too much memory on the production server hosted on Fly.io. If you look at the memory utilization graph here, you can see that the app sits at 215 MB at baseline, while the VM's limit is 221 MB. Any time the scraping job runs, the memory spike pushes the application over the VM's limit and the OOM reaper kills the process.
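One possible mitigation (a sketch, not the fix pursued in this thread) is capping the JVM so its footprint stays under the VM's limit, assuming the app's start script honors `JAVA_OPTS` and the runtime is JDK 10 or newer:

```shell
# Hypothetical JVM sizing for a ~221 MB VM: cap the heap relative to the
# container's memory limit and shrink per-thread stacks.
# -XX:MaxRAMPercentage is a standard HotSpot flag on JDK 10+.
export JAVA_OPTS="-XX:MaxRAMPercentage=50.0 -Xss512k"
echo "$JAVA_OPTS"
```

This only caps the heap and stacks; metaspace, code cache, and native buffers still add to the total, so the percentage needs headroom.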

Rogibb111 commented 1 year ago

Research Findings

I downloaded VisualVM and Eclipse MAT and did some poking around at the JVM instance for the Ski Resort Dashboard. The leak analyzer in MAT didn't really find anything noteworthy (one small potential leak from a library in my stack). The funny thing is that when I run the production build locally (not via sbt run), the memory footprint is much smaller: the baseline on my local machine was anywhere between 130 and 150 MB. I don't know yet what is causing the extra memory consumption on the server, but from poking around with ps on the server console, it looks like the JVM instance is taking up almost all of the memory available to the machine.
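The ps check mentioned above can be sketched like this (an illustrative one-liner, not the exact command used; assumes Linux with procps):

```shell
# Report the resident set size of the largest java process, in MB.
# Falls back to 0 MB if no JVM is running.
rss_kb=$(ps -eo rss,comm | awk '$2 ~ /java/ {print $1}' | sort -rn | head -n1)
echo "largest java RSS: $((${rss_kb:-0} / 1024)) MB"
```

Comparing this number on the server against the same command locally is a quick way to confirm the footprint gap before reaching for VisualVM.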

Rogibb111 commented 1 year ago

Action Steps

I am not entirely sure of all of the differences between the JVM instance of my application on my local machine vs. the instance on the production server, but one major one I can think of is that the container image that gets pushed onto the server VM is built by a Heroku Buildpack. I'm going to try building my own Docker image instead of using Heroku Buildpacks to do it for me. This will require adding a Dockerfile to my repo, and probably a .dockerignore as well. I'll test that my Dockerfile is correct and builds a working image by running it on my work laptop, which has Docker installed.
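A minimal sketch of what such a Dockerfile might look like — the base images, tags, and binary name here are illustrative assumptions, not the actual files from this repo; it assumes an sbt project staged via sbt-native-packager's stage task:

```dockerfile
# Hypothetical multi-stage build for an sbt/Play app.
# Stage 1: compile and stage the app with sbt (image tag is illustrative).
FROM sbtscala/scala-sbt:eclipse-temurin-17.0.8_7_1.9.6_2.13.12 AS build
WORKDIR /app
COPY . .
RUN sbt stage

# Stage 2: run the staged app on a slim JRE-only image.
FROM eclipse-temurin:17-jre
WORKDIR /app
COPY --from=build /app/target/universal/stage ./
EXPOSE 9000
CMD ["./bin/ski-resort-dashboard", "-Dhttp.port=9000"]
```

A .dockerignore excluding target/, .git/, and logs keeps the build context (and therefore the copied layers) small.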

Rogibb111 commented 1 year ago

Action Results

Well, I created a Dockerfile for the ski-resort-dashboard project and got it working with Fly.io. Although it did save me about 30 to 40 MB of operating memory, that still wasn't enough to keep the project from running out of memory when the scraping job boots up.

Research Findings

Because of this, I started looking at alternative hosting for the project. Unfortunately, there were no other free solutions like Fly.io or Heroku where a simple command-line tool like heroku-cli or flyctl magically gets the project running on a pre-configured server. The free offerings from the likes of Google or Amazon would basically have me rebuilding the entire project as Lambdas, and the other Fly.io/Heroku-style platforms I found weren't free. I did find Oracle's cloud offering, which has a free tier: essentially two VMs with 1 GB of memory and 50 GB of storage, as well as two 50 GB attachable volumes and a network configuration. I decided to use this, with Dokku running on one VM, as the way to host the website.
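Deploying a prebuilt image to that Dokku instance can be sketched as follows — the app name and image URL are illustrative placeholders, and `git:from-image` assumes Dokku 0.24 or newer:

```shell
# Hypothetical: run once on the Oracle VM hosting Dokku.
dokku apps:create ski-resort-dashboard

# Point the app at a prebuilt image instead of a git-push build.
dokku git:from-image ski-resort-dashboard ghcr.io/laughing-man-studios/ski-resort-dashboard:latest
```

The appeal of `git:from-image` here is that CI builds and publishes the image, and the VM only ever pulls and runs it.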

Rogibb111 commented 1 year ago

Re-opening due to issues with the Dokku GitHub Action.

Rogibb111 commented 1 year ago

Re-opening because my last PR commit to fix this issue didn't have the correct commit title, so the release pipeline didn't cut a new version. Because of this, the deploy pipeline never ran, so I don't know whether the changes worked. Creating a new PR with some small changes to the README to kick off the deploy pipeline this time.
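For context, semantic-release (with its default Angular commit convention) only cuts a release for certain commit types, which is presumably why the mistitled commit was skipped. A runnable illustration (the repo and message are made up):

```shell
# A "fix:"-typed commit title is what semantic-release's default rules
# turn into a patch release; a "chore:" title would release nothing.
cd "$(mktemp -d)"
git init -q
git config user.email ci@example.com && git config user.name ci
git commit --allow-empty -q -m "fix: correct deploy configuration"
git log -1 --pretty=%s    # prints "fix: correct deploy configuration"
```

With squash merges, it's the PR title that becomes this commit subject, so the PR title has to follow the convention too.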

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.11 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening again because the push-to-tag event is not triggered when the release workflow runs. Going to use the workflow_run event instead to try to get the deploy workflow to trigger.
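The workflow_run approach might look like the snippet below — the workflow name "Release" is an assumption; the event and gating condition are standard GitHub Actions syntax:

```yaml
# Hypothetical deploy-workflow trigger: run after the "Release" workflow
# finishes, and only if it succeeded.
on:
  workflow_run:
    workflows: ["Release"]
    types: [completed]

jobs:
  deploy:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
```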

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.12 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening again due to the Deploy workflow failing. Looks like the git_remote_url isn't correct, so I'm adding yet another PR to fix that.

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.13 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening again due to the docker image URL not being formatted correctly.

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.14 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening again due to Dokku being unauthorized to pull the Docker image from ghcr.io
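One common fix for this kind of unauthorized pull is logging the Dokku host's Docker daemon into ghcr.io with a personal access token — a sketch, assuming a PAT with the read:packages scope stored in GHCR_TOKEN (the token variable and username are placeholders):

```shell
# Hypothetical: authenticate the host's Docker daemon against ghcr.io so
# Dokku can pull private images. --password-stdin keeps the token out of
# shell history and process lists.
echo "$GHCR_TOKEN" | docker login ghcr.io -u <github-username> --password-stdin
```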

Rogibb111 commented 1 year ago

Re-opening due to fixing allowed domains

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.16 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening again because the Docker image URL is the same for every deploy. I need to attach the digest to the end of the image URL so that Dokku knows to pull the updated image.
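Pinning by digest looks like this — the image name and digest are placeholders, but the `name@sha256:...` form is standard Docker image-reference syntax:

```shell
# Hypothetical: build the image reference from the digest emitted by the
# publish step, so the reference changes on every deploy even though the
# tag is always "latest".
IMAGE=ghcr.io/laughing-man-studios/ski-resort-dashboard
DIGEST=sha256:<digest-output-by-the-build-job>
dokku git:from-image ski-resort-dashboard "$IMAGE@$DIGEST"
```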

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.17 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening because the trigger on pushes to tags doesn't work. Trying the release event instead.
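A likely reason the tag-push trigger never fired: GitHub Actions deliberately does not start workflows for events created with the default GITHUB_TOKEN, which is what semantic-release pushes its tags with. A release-event trigger would look something like:

```yaml
# Hypothetical: trigger the deploy when a GitHub release is published.
on:
  release:
    types: [published]
```

Note the same GITHUB_TOKEN limitation applies to releases created by a workflow, so the release step may need to authenticate with a personal access token for this trigger to fire.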

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.18 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

Rogibb111 commented 1 year ago

Re-opening because I forgot to remove the check from the workflow_run trigger.

Rogibb111 commented 1 year ago

:tada: This issue has been resolved in version 1.1.19 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket: