Failing to download core agent if it already exists

asgrim commented 2 years ago

production.ERROR: [Scout] Failed to create directory "/tmp/scout_apm_core/scout_apm_core-v1.3.0-x86_64-unknown-linux-musl": mkdir(): File exists {"exception":"[object] (ErrorException(code: 0): mkdir(): File exists at /app/vendor/scoutapp/scout-apm-php/src/CoreAgent/Downloader.php:128)
2022-02-24T21:28:48.409086+00:00 app[web.14]: #2 /app/vendor/scoutapp/scout-apm-php/src/CoreAgent/Downloader.php(101): Scoutapm\\CoreAgent\\Downloader->createCoreAgentDir()
2022-02-24T21:28:48.409313+00:00 app[web.14]: #3 /app/vendor/scoutapp/scout-apm-php/src/CoreAgent/AutomaticDownloadAndLaunchManager.php(63): Scoutapm\\CoreAgent\\Downloader->download()
2022-02-24T21:28:48.409510+00:00 app[web.14]: #4 /app/vendor/scoutapp/scout-apm-php/src/Agent.php(214): Scoutapm\\CoreAgent\\AutomaticDownloadAndLaunchManager->launch()
2022-02-24T21:28:48.409692+00:00 app[web.14]: #5 /app/vendor/scoutapp/scout-apm-laravel/src/Middleware/SendRequestToScout.php(30): Scoutapm\\Agent->connect()
2022-02-24T21:28:48.409994+00:00 app[web.14]: #6 /app/vendor/laravel/framework/src/Illuminate/Pipeline/Pipeline.php(171): Scoutapm\\Laravel\\Middleware\\SendRequestToScout->handle(Object(Illuminate\\Http\\Request), Object(Closure))

Looks like there is a bug here where if the path already exists, the agent download process crashes. I would've thought we'd already handled existing directory, but investigate further - perhaps permissions means we can't overwrite or something?

asgrim commented 2 years ago

Note: this is definitely an older version of scout-apm-php since the scout-apm-laravel code exists. Not sure which version is being used though.

asgrim commented 2 years ago

Further info; changing the env var may be the cause if this, depending on how Heroku behaves when this happens; perhaps it kills the running process, but leaves the /tmp path in tact?

asgrim commented 2 years ago

Hmm, did some investigation this afternoon, I'm not sure changing the env var is the cause here; I tried this in Heroku and it did fully restart the node:

2022-03-02T14:51:58.871977+00:00 heroku[worker.1]: Stopping all processes with SIGTERM
2022-03-02T14:51:59.303098+00:00 heroku[worker.1]: Process exited with status 143
2022-03-02T14:51:59.443577+00:00 heroku[worker.1]: Starting process with command `php artisan queue:work`
2022-03-02T14:51:59.290285+00:00 app[web.1]: SIGTERM received, attempting graceful shutdown...
2022-03-02T14:51:59.291008+00:00 app[web.1]: Stopping httpd gracefully...
2022-03-02T14:51:58.895539+00:00 heroku[web.1]: Stopping all processes with SIGTERM
2022-03-02T14:52:00.121932+00:00 heroku[worker.1]: State changed from starting to up
2022-03-02T14:52:00.156345+00:00 heroku[web.1]: Starting process with command `vendor/bin/heroku-php-apache2 public/`
2022-03-02T14:52:00.350309+00:00 app[web.1]: Stopping php-fpm gracefully...
2022-03-02T14:52:00.471076+00:00 app[web.1]: Shutdown complete.
2022-03-02T14:52:00.662258+00:00 heroku[web.1]: Process exited with status 143
2022-03-02T14:51:57.638808+00:00 heroku[web.1]: Restarting
2022-03-02T14:51:57.679338+00:00 heroku[web.1]: State changed from up to starting
2022-03-02T14:51:57.713450+00:00 heroku[worker.1]: Restarting
2022-03-02T14:51:57.737295+00:00 heroku[worker.1]: State changed from up to starting
2022-03-02T14:52:01.078276+00:00 app[worker.1]: [2022-03-02 14:52:01] local.DEBUG: [Scout] Laravel Scout Agent is starting
2022-03-02T14:52:01.078289+00:00 app[worker.1]:

Perhaps a general "crash" of the core-agent may reproduce this, but I need to simulate the process going away. Will try that next.

asgrim commented 2 years ago

Hmm, still unable to reproduce this. On my test Heroku instance I added an HTTP endpoint that kills the running core-agent (on the web node, not an ephemeral CLI node!), but even with the core-agent simply killed, leaving the /tmp path in tact, the agent detects the manifest and sees that everything is fine, and launches the Core Agent:

[2022-03-08 09:55:06] local.DEBUG: [Scout] Laravel Scout Agent is starting
[2022-03-08 09:55:06] local.DEBUG: [Scout] Configuration: {"monitor":true,"name":"Laravel Blog Heroku","key":"<redacted>","log_level":"DEBUG","log_payload_content":false,"api_version":"1.0","ignore":[],"application_root":null,"scm_subdirectory":null,"revision_sha":null,"hostname":null,"disabled_instruments":null,"core_agent_log_level":"DEBUG","core_agent_log_file":null,"core_agent_config_file":null,"core_agent_socket_path":"tcp:\/\/127.0.0.1:6590","core_agent_dir":"\/tmp\/scout_apm_core","core_agent_full_name":"scout_apm_core-v1.4.0-x86_64-unknown-linux-musl","core_agent_download_url":"https:\/\/s3-us-west-1.amazonaws.com\/scout-public-downloads\/apm_core_agent\/release","core_agent_launch":true,"core_agent_download":true,"core_agent_version":"v1.4.0","core_agent_triple":"x86_64-unknown-linux-musl","core_agent_permissions":511,"uri_reporting":"path","uri_filtered_params":["access","access_token","api_key","apikey","auth","auth_token","card","certificate","credentials","crypt","key","mysql_pwd","otp","passwd","password","private","protected","salt","secret","ssn","stripetoken","token"],"errors_enabled":false,"errors_ignored_exceptions":[],"errors_host":"https:\/\/errors.scoutapm.com","errors_batch_size":5,"errors_filtered_params":["password","s3-key","scout_key"]}
[2022-03-08 09:55:06] local.INFO: [Scout] Scout Core Agent (app=Laravel Blog Heroku, ext=n/a) not connected yet, attempting to start
[2022-03-08 09:55:06] local.INFO: [Scout] Parsing Core Agent Manifest at "/tmp/scout_apm_core/scout_apm_core-v1.4.0-x86_64-unknown-linux-musl/manifest.json"
[2022-03-08 09:55:06] local.INFO: [Scout] Parsing Core Agent Manifest at "/tmp/scout_apm_core/scout_apm_core-v1.4.0-x86_64-unknown-linux-musl/manifest.json"
[2022-03-08 09:55:06] local.DEBUG: [Scout] exec is available
[2022-03-08 09:55:06] local.DEBUG: [Scout] Core Agent Launch in Progress
[2022-03-08 09:55:06] local.DEBUG: [Scout] Launching core agent with command: '/tmp/scout_apm_core/scout_apm_core-v1.4.0-x86_64-unknown-linux-musl/core-agent' 'start' '--daemonize' 'true' '--log-file' '/dev/null' '--log-level' 'DEBUG' '--tcp' '127.0.0.1:6590'
[2022-03-08 09:55:06] local.WARNING: [Scout] Failed to connect to socket on address "tcp://127.0.0.1:6590", previous message: socket_connect(): unable to connect [111]: Connection refused
[2022-03-08 09:55:06] local.DEBUG: [Scout] Handle MiddlewareInstrument
[2022-03-08 09:55:06] local.DEBUG: [Scout] Handle ActionInstrument
[2022-03-08 09:55:09] local.DEBUG: [Scout] Connected to connector whilst sending.
[2022-03-08 09:55:09] local.DEBUG: [Scout] Skipping metadata send, already sent
[2022-03-08 09:55:09] local.DEBUG: [Scout] Sending metrics from 3 collected spans.
[2022-03-08 09:55:09] local.DEBUG: [Scout] Sent whole payload successfully to core agent.
[2022-03-08 09:55:09] local.DEBUG: [Scout] SendRequestToScout succeeded

Note: the WARNING about failed to connect is expected, as it takes a moment for core-agent to launch.

Not sure how else to reproduce this issue currently.

scoutapp / scout-apm-php

Failing to download core agent if it already exists #257