jef / streetmerchant

🤖 The world's easiest, most powerful stock checker
https://jef.buzz/streetmerchant
MIT License
StreetMerchant Docker container causing random server reboots #1591

Closed shindouj closed 3 years ago

shindouj commented 3 years ago

I run StreetMerchant on Debian as a dockerized service. About twenty to sixty minutes after it detects a card that's in stock, my whole server (!) reboots. Neither the container logs nor the system logs contain any clue as to why it happens. I'm going to run StreetMerchant in debug mode and send you any logs if they're usable, but needless to say, the script is close to unusable in this state, and definitely not usable on any kind of server.

Parameters set:

STORES=alternate,amazon-de,amazon-uk,asus-de,euronics-de,mediamarkt,saturn
SHOW_ONLY_SERIES=3070,3080
DISCORD_WEB_HOOK=<redacted_webhook>
DISCORD_NOTIFY_GROUP_3070=<redacted_group_id>
DISCORD_NOTIFY_GROUP_3080=<redacted_group_id>
MAX_PRICE_SERIES_3080=1050
MAX_PRICE_SERIES_3070=800
SCREENSHOT=false
AUTO_ADD_TO_CART=false
BROWSER_TRUSTED=false
OPEN_BROWSER=false

Parameters SCREENSHOT, AUTO_ADD_TO_CART, BROWSER_TRUSTED and OPEN_BROWSER were set in response to these behaviors, but nothing has changed since I've set them.

EDIT: While setting the logger to debug, I've found that OPEN_BROWSER was set incorrectly to blank instead of false. I've set it to false and will update this report if it crashes again (or not).

Originally posted by @shindouj in https://github.com/jef/streetmerchant/issues/1557#issuecomment-753705054

wofnull commented 3 years ago

Could you provide some more information?
-> Which Debian version?
-> What hardware does the Debian server run on?
-> How are the CPU / system temperatures before the server reboots?
-> Is the power supply sufficient?
-> How does the underlying Debian server behave under full load (100% load on all cores)? Does it also crash after the same period of time as it does with the Docker container?

Just asking because in the last few weeks many reports of this kind (frozen or crashed Linux systems) turned out to come from too-weak processors or general system slowness (Raspberry Pis), too little memory, or general issues tracing back to bad cooling or weak power supplies.

For example, I run this script directly (not in Docker) on Debian 9.5 on a Ryzen 7 1800X / 64 GB RAM and have had no issues running it for more than 24 hours.

shindouj commented 3 years ago

Which Debian version? What hardware does the Debian server run on?

It's Debian 10 w/ 4.19.0-12 kernel, running on i7-4770 / 32 GB RAM.

How are the CPU / system temperatures before the server reboots?

I did not monitor the temperatures right before the crash as I don't have any monitoring installed there, I'll do that later today.

is the power supply sufficient?

I sure hope Hetzner installs sufficient power supplies in their servers. I have no information about what's installed there, nor have I ever inquired about it. I have no physical access to the machine.

How does the underlying Debian server behave under full load (100% load on all cores)? Does it also crash after the same period of time as it does with the Docker container?

However hysterical I find the fact that this simple script can put anything between 50 and 550% of CPU load on a fairly capable machine, I have never encountered any similar issues with this rig. I've been successfully throwing stuff at this machine without an issue for the past several months (including hosting various game servers) and it always ran fine. I'm gonna do some stress testing later and will let you know.

I have also checked the debug log after the latest crash and yeah, there's nothing there, either.

wofnull commented 3 years ago

OK, since this is a really beefy system and also hosted at a well-known server hoster ... I would not assume temperature / power supply issues either ... I was thinking more of a self-hosted server on a local LAN ;)

However: yes, the script, or rather the way it works, tends to pull a lot of computing power, since every call the script makes opens a Chromium process, parses the opened site, and closes it down. Depending on the items on your search list and the number of shops the script searches, this can lead to a huge number of processes and therefore a lot of memory in use (Chromium / Chrome is not known to be really memory efficient).

As you are only filtering by series and a list of shops, it could be (since there are a lot of variations of 3080 / 3070s around) that you are running into a memory limit.

shindouj commented 3 years ago

This might be a stupid question, but... why not curl? If the only thing this script does is fetch a website, it could just as well do that by invoking curl with a fake user agent.

As you are only filtering by series and a list of shops, it could be (since there are a lot of variations of 3080 / 3070s around) that you are running into a memory limit.

I refuse to believe 32 gigs of RAM is not enough to run this. Last time I checked it consumed well below that (around 5); not to mention that I have never seen a Linux server just die on the spot without leaving any log because of insufficient system memory.

I'm stress testing this server and while the thermals are atrocious for a desktop PC (CPU is at just below 90C in sustained 100% all-core load), it has been running stable for the past twenty minutes or so, not to mention that these kinds of temperatures are supposedly not unheard of in server environments. Since the script is nowhere near this kind of load, I doubt this is the issue as well.

What I'm going to do is to set a hard limit for StreetMerchant's RAM and CPU loads. If this does not help, it must be a StreetMerchant problem.

anthonytam commented 3 years ago

If you're still experiencing crashing after setting RAM and CPU limits, have you looked at the kernel logs? Perhaps this will contain information of killed processes / kernel panics. The only material difference I can think of between our systems is the use of a hypervisor and the age of the CPU. I would be interested in seeing if the kernel logs point to a specific crash culprit.

For context behind system performance, I am running the application with all Canadian retailers selected and all GPUs. The host machine has an Intel Xeon E3-1270 v6 with 6 of 8 cores assigned to the street merchant VM (I sit at ~40% utilization) and have 4GB of RAM assigned to the VM (I normally have 1 to 2GB of free RAM).

shindouj commented 3 years ago

have you looked at the kernel logs?

As I have mentioned previously, no system logs (kernel, syslog, messages, etc. etc.) contain any remark about the shutdown reason. Hetzner has performed a short review of my machine and they claim everything is up to their standards (including cooling).

I have since limited the Docker container to 2 CPU cores (it utilizes 100% of its limit) and to 8 GB of RAM (which it never even touched). It's been working for the last few hours without a hitch - I'll update this issue once it has run stably for a day or so.

I still think such a trivial task should not involve so many system resources, but it is what it is. Would it be too much to ask you to clarify why you settled for using full Chromium instance(s) instead of a simpler HTTP(S) client like curl?

anthonytam commented 3 years ago

Sorry I missed that, I just wanted to make sure the system logs included the kernel entries. Unfortunately I don't have any CPUs on hand of around the same age, so I'm not able to test a configuration similar to yours; I haven't been able to replicate the issue in Docker with what I have (some older, some newer chips).

Would it be too much to ask you to clarify why you settled for using full Chromium instance(s) instead of a simpler HTTP(S) client like curl?

I can't speak to the initial implementation of the application, but I can mention my past experience with web scraping.

Unfortunately, in the current state of the web, JavaScript has become a requirement for most websites to load dynamic data. In the context of streetmerchant, without JS most sites would not load their stock numbers. Curl has no way to run JS or modify the DOM for dynamic elements. A few years ago PhantomJS was quite popular for this reason, but support for it has been quite up and down. Currently, the only reliable way to scrape dynamic client-side content is to spin up a full browser engine and render the page, which also lets you mimic interactions on the site (for example, some sites don't load the full page by default; you need to scroll down before more page data is requested in JS). It's bulky but unfortunately the most reliable.

benben commented 3 years ago

I was wondering about this myself; streetmerchant takes A LOT of resources. If I limit the CPUs through Docker to anything less than 3, it starts failing and timing out. I can think of two things a person fluent in the TypeScript/JS ecosystem could try: spawn fewer Chrome instances or reuse open Chrome instances. It looks like it spins up a Chrome for every request. Can someone confirm this?

wofnull commented 3 years ago

It actually works with Puppeteer in the background, which spawns only one Chrome instance and keeps it open, opening a new page (tab) for every request.

Chromium itself is optimized for maximum performance and stability and uses all available resources by design. So when you open a new tab, it creates one or more additional processes for that tab and ends them when the tab closes. This leads to strong system lag when many requests happen at once.

You can check this behaviour in a normal desktop environment by running with your config and setting the headless option to false there.

The only way around this would be to switch from Chrome to something far less performance-hungry like curl / wget, which would indeed require a big overhaul of the script's foundations.
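Short of replacing the browser, the lag from many simultaneous page loads could in principle be bounded by capping how many pages are allowed to load at once. What follows is only a sketch of that idea, not streetmerchant code: `PageSlots`, `run` and `inFlight` are hypothetical names, and the task passed to `run` stands in for whatever opens, checks and closes a page.

```typescript
// Hypothetical helper: caps the number of page loads running at the same time.
class PageSlots {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  /** Number of tasks currently holding a slot. */
  get inFlight(): number {
    return this.active;
  }

  /** Runs the task as soon as a slot is free and releases the slot afterwards. */
  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // All slots taken: queue up until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // pass the slot straight to the next waiter, count stays the same
      } else {
        this.active--;
      }
    }
  }
}
```

With a limit of, say, 3, every store lookup would be wrapped as `slots.run(() => lookupStore(store))`, where `lookupStore` is again a placeholder. The cap trades detection latency for a predictable load ceiling.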

benben commented 3 years ago

Ok that makes sense. Thank you for giving that explanation! From the processes it really looks like a fully blown desktop chrome from a human with 100 abandoned tabs open 😂

I tried to find a setting that limits how many pages are opened per second or similar. I couldn't find anything besides playing with PAGE_SLEEP_MIN and PAGE_SLEEP_MAX, but that didn't really help.

It would be helpful to have some kind of throttling setting for streetmerchant: Only open a page every 10 seconds or similar. I tried to find that part in the code already but wasn't successful. Even a simple sleep might help already?

What do you think?

benben commented 3 years ago

Ok I tried again to read the code and it looks like there is no common shared pool which you could throttle; instead it runs a timeout for each store independently. This means shops do not request in sync but rather randomly, depending on their response time, which makes it possible that at some point a lot of shops time out and try to request at the same time, which will cause high load. Just imagine opening your browser and having it try to load all open tabs at once.

With this, it looks like it would be hard to add some kind of throttling, since you would need to keep track of how many requests were made in the last X seconds in a central place and then delay a new request until the count is below the limit.
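The central bookkeeping described here can be sketched as a sliding-window limiter that every store loop would consult before opening a page. This is an illustration under assumptions, not existing streetmerchant code: `SlidingWindowLimiter`, `tryAcquire`, `delayFor` and `acquire` are made-up names, and the injectable clock exists only so the logic can be checked without real waiting.

```typescript
// Hypothetical central limiter shared by all store loops.
class SlidingWindowLimiter {
  private readonly timestamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxRequests: number, windowMs: number, now: () => number = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
  }

  /** Drops recorded requests that are older than the window. */
  private prune(t: number): void {
    for (;;) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || t - oldest < this.windowMs) return;
      this.timestamps.shift();
    }
  }

  /** Milliseconds until the next request would fit into the window (0 = now). */
  delayFor(): number {
    const t = this.now();
    this.prune(t);
    const oldest = this.timestamps[0];
    if (oldest === undefined || this.timestamps.length < this.maxRequests) return 0;
    return oldest + this.windowMs - t;
  }

  /** Records a request and returns true if the window still has room. */
  tryAcquire(): boolean {
    if (this.delayFor() > 0) return false;
    this.timestamps.push(this.now());
    return true;
  }

  /** Waits until a request is allowed, then records it. */
  async acquire(): Promise<void> {
    while (!this.tryAcquire()) {
      const wait = this.delayFor();
      await new Promise<void>((resolve) => setTimeout(resolve, wait));
    }
  }
}
```

A store loop would call `await limiter.acquire()` right before each page load. For example, `new SlidingWindowLimiter(6, 10_000)` would allow at most six loads in any ten-second span, however the individual store timers happen to line up.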

Any other ideas?

DISCLAIMER: Not a JS/TS/node dev.

wofnull commented 3 years ago

As I often have headless mode deactivated for testing, I can only say that most of the time (with many stores / items added) the script has only about 5-12 tabs open, since it closes them right after the site has fully loaded.

However, the CPU peaks come from the page loading itself, depending on the content (especially ads). Chromium loads the site the way it would for a human visitor (foreground, loading everything, so that site scripts can show things like stock / availability live on screen if needed) ... but this pulls in much more crap than just the page and the script that gets loaded: external resources like embedded videos (Caseking, for example, embeds YouTube videos), pictures and so on. This takes up a lot of CPU power and cannot be mitigated ... even with only 5 tabs open it costs much more CPU power than needed.

Another issue with Chrome / Chromium is memory management: closing a tab does not free its memory right away; it takes about 60-90 seconds for the browser to release the RAM. This is baked in and is meant to improve performance for tabs that share cached content such as pictures ... with a few stores this works well, but with many stores included, Chromium's memory usage rises enormously.

As said before, switching to another browser for the automated testing would be the best way to solve this, but also as said before: the result would be a huge change to the streetmerchant codebase.

benben commented 3 years ago

Can we tune Chrome to not load everything? Adblock things, avoid images and videos and external JS/CSS?

wofnull commented 3 years ago

Not that I know of. Puppeteer generally runs Chromium in a profile-less environment, so adblockers, Tampermonkey and so on cannot be loaded into the browser: a manually integrated blocker would be removed again on every restart of the script. This is the main problem when a full-blown desktop browser is used to mass-check links in automation.
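One option that needs neither a profile nor an extension is Puppeteer's request interception (`page.setRequestInterception(true)` plus a `'request'` handler that calls `abort()` or `continue()` on each request). The sketch below shows the idea only. The interfaces are minimal stand-ins modelled on the Puppeteer methods involved, so the block compiles without the dependency, and the blocked list is an assumption: whether a given store still reports stock correctly without images, media and fonts would have to be checked per site.

```typescript
// Minimal stand-ins modelled on Puppeteer's HTTPRequest and Page (assumed shapes).
interface InterceptedRequest {
  resourceType(): string;
  abort(): Promise<void>;
  continue(): Promise<void>;
}

interface InterceptablePage {
  setRequestInterception(enabled: boolean): Promise<void>;
  on(event: 'request', handler: (request: InterceptedRequest) => void): unknown;
}

// Assumption: these resource types are not needed to read stock status.
const BLOCKED_RESOURCE_TYPES = new Set<string>(['image', 'media', 'font']);

function shouldBlock(resourceType: string): boolean {
  return BLOCKED_RESOURCE_TYPES.has(resourceType);
}

/** Aborts heavy requests and lets everything else through. */
function routeRequest(request: InterceptedRequest): Promise<void> {
  return shouldBlock(request.resourceType()) ? request.abort() : request.continue();
}

/** Hypothetical hook: would be called once per page before navigating. */
async function blockHeavyResources(page: InterceptablePage): Promise<void> {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    // A request may already be handled or gone; ignore that failure.
    routeRequest(request).catch(() => undefined);
  });
}
```

Scripts and XHR/fetch calls are deliberately left alone here, since the stock information itself is often loaded that way.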

benben commented 3 years ago

With all the cycles spent on rendering endless websites we could mine coins instead and buy our cards like that lol. /offtopic

wofnull commented 3 years ago

In fact I'm not doing this for a new card, I have my 3090 FE from the first NVIDIA batch :)

I want only to support @jef for his great work on this and make it at least possible for a few people to buy a card the way they want it ;)

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

github-actions[bot] commented 3 years ago

This issue has been closed because it is stale. Reopen if necessary.