Closed vpatrikeev closed 3 years ago
Hello @vpatrikeev. I'm not sure that I completely understand you, but some observations: if the worker is blocked, RR will wait for the allocation_timeout time and send you an error that the worker can not be allocated, because it's still blocked internally. By "Stop", what exactly do you mean?
@48d90782 I forgot that you guys are ours and we can speak Russian) Yes, the configuration is still the default for many settings. The question is the following: max_jobs: 1 is set. That is, after 1 request the worker should stop. But a new replacement worker does not come up?
(We can't use Russian here, sorry; someone else might run into the same problem, so the language of communication in this ticket is English.)
If you want to discuss some point, welcome to our discord server, in the roadrunner -> russian channel.
The question in English: max_jobs is set to 1. After the request, the worker should be stopped. Does a new worker come up as a replacement?
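For context, these settings live in the pool section of .rr.yaml. A minimal sketch (values are illustrative; I believe the RR2 key names are num_workers, max_jobs, and allocate_timeout, but double-check them against the docs for your RR version):

```yaml
http:
  address: 0.0.0.0:80
  pool:
    num_workers: 4         # fixed number of PHP workers
    max_jobs: 1            # terminate each worker after a single request
    allocate_timeout: 60s  # how long to wait for a free or new worker
```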
Yes, the old worker, after the response, should be terminated (killed) and replaced with the new one.
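The replace-on-max_jobs cycle described here can be sketched as a toy model in Go (all names are invented here, this is not RoadRunner's actual code): after a worker has served max_jobs requests, it is destructed and a fresh worker is constructed in its place, so the next request is always served by a live worker.

```go
package main

import "fmt"

// worker is a toy stand-in for an RR worker process (not the real API).
type worker struct {
	pid   int
	execs int
}

// pool replaces its worker once it has served maxJobs requests,
// mirroring the destruct/construct cycle described above.
type pool struct {
	maxJobs int
	nextPID int
	w       *worker
	events  []string
}

func (p *pool) spawn() {
	p.nextPID++
	p.w = &worker{pid: p.nextPID}
}

func (p *pool) handle(req string) {
	p.w.execs++
	p.events = append(p.events, fmt.Sprintf("worker %d handled %s", p.w.pid, req))
	if p.w.execs >= p.maxJobs {
		// kill the exhausted worker and bring up a replacement
		p.events = append(p.events, fmt.Sprintf("worker %d destructed", p.w.pid))
		p.spawn()
		p.events = append(p.events, fmt.Sprintf("worker %d constructed", p.w.pid))
	}
}

func run() []string {
	p := &pool{maxJobs: 1}
	p.spawn()
	p.handle("req-1")
	p.handle("req-2") // served by the fresh worker, not the killed one
	return p.events
}

func main() {
	for _, e := range run() {
		fmt.Println(e)
	}
}
```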
@vpatrikeev Also, could you please update your RR2 to version 2.0.3, because we updated the mechanism of killing workers by max_jobs reached event.
As I can see, it appears that a new worker does not come up. (I updated RR to 2.0.3.)
Here is my docker-compose:

version: '3.2'

services:
  app: &app-service
    build:
      context: .
      dockerfile: ./Dockerfile
    container_name: "${SERVICE_SLUG:-gb_service_template_app}"
    restart: on-failure
    volumes:
      - ./docker/app-entrypoint.sh:/app-entrypoint.sh:ro
      - .:/app
    env_file:
      - .env
    ports:
      - "${APP_PORT:-80}:80"
      - "${APP_PORT_SSL:-443}:443"
    command: 'rr serve -c /app/.rr.local.yml'
    networks:
      - app-network
      - gb-service

  queue:
    <<: *app-service
    container_name: "${SERVICE_SLUG_QUEUE:-gb_service_template_queue}"
    environment:
      STARTUP_DELAY: 5
    command: 'php /app/artisan queue:work --memory=64'
    ports: []

networks:
  app-network:
    driver: bridge
  gb-service:
    name: 'gb-service'
    driver: bridge
And the Dockerfile:
FROM composer:2 AS composer
FROM spiralscout/roadrunner:2.0.3 AS roadrunner
FROM php:8.0.3-alpine
ENV \
COMPOSER_ALLOW_SUPERUSER="1" \
COMPOSER_HOME="/tmp/composer" \
OWN_SSL_CERT_DIR="/ssl-cert" \
OWN_SSL_CERT_LIFETIME=1095 \
PS1='\[\033[1;32m\]\[\033[1;36m\][\u@\h] \[\033[1;34m\]\w\[\033[0;35m\] \[\033[1;36m\]# \[\033[0m\]'
# persistent / runtime deps
ENV PHPIZE_DEPS \
build-base \
autoconf \
libc-dev \
pcre-dev \
openssl \
pkgconf \
cmake \
make \
file \
re2c \
g++ \
gcc
# permanent deps
ENV PERMANENT_DEPS \
postgresql-dev \
git \
openssh
COPY --from=composer /usr/bin/composer /usr/bin/composer
COPY --from=roadrunner /usr/bin/rr /usr/bin/rr
RUN set -xe \
&& apk add --no-cache ${PERMANENT_DEPS} \
&& apk add --no-cache --virtual .build-deps ${PHPIZE_DEPS} \
# https://github.com/docker-library/php/issues/240
&& apk add --no-cache --repository http://dl-3.alpinelinux.org/alpine/edge/community gnu-libiconv \
&& docker-php-ext-configure opcache --enable-opcache \
&& docker-php-ext-configure pdo_pgsql \
&& docker-php-ext-configure bcmath --enable-bcmath \
&& docker-php-ext-install -j$(nproc) \
sockets \
pdo_pgsql \
opcache \
bcmath \
&& pecl install -o -f redis \
&& rm -rf /tmp/pear \
&& docker-php-ext-enable redis \
&& ( mkdir -p ${OWN_SSL_CERT_DIR} \
&& cd ${OWN_SSL_CERT_DIR} \
&& openssl genrsa -passout pass:x -out self-signed.key 2048 \
&& cp self-signed.key self-signed.key.orig \
&& openssl rsa -passin pass:x -in self-signed.key.orig -out self-signed.key \
&& openssl req -new -key self-signed.key -out cert.csr \
-subj "/C=RU/ST=RU/L=Somewhere/O=SomeOrg/OU=IT Department/CN=example.com" \
&& openssl x509 -req -days ${OWN_SSL_CERT_LIFETIME} -in cert.csr -signkey self-signed.key -out self-signed.crt \
&& ls -lh ) \
&& apk del .build-deps \
&& rm -rf /app /home/user ${COMPOSER_HOME} /var/cache/apk/* \
&& mkdir /app /home/user ${COMPOSER_HOME} \
&& ln -s /usr/bin/composer /usr/bin/c
COPY ./docker/php.ini /usr/local/etc/php/php.ini
COPY ./docker/opcache.ini /usr/local/etc/php/conf.d/opcache.ini
COPY ./docker/app-entrypoint.sh /app-entrypoint.sh
WORKDIR /app
COPY ./auth.* /app/
COPY ./composer.* /app/
RUN set -xe \
&& composer install --no-interaction --no-ansi --prefer-dist --no-scripts --no-autoloader \
&& composer install --no-interaction --no-ansi --prefer-dist --no-scripts --no-dev \
&& chmod -R 777 /home/user ${COMPOSER_HOME}
COPY . /app
RUN set -xe \
&& chmod +x /app-entrypoint.sh
EXPOSE 80
EXPOSE 443
# DO NOT OVERRIDE ENTRYPOINT IF YOU CAN AVOID IT! @see <https://github.com/docker/docker.github.io/issues/6142>
ENTRYPOINT ["/app-entrypoint.sh"]
CMD ["rr", "serve", "-c", "/app/.rr.yml"]
It is all built on top of a Laravel 8 app using your roadrunner-laravel package.
@vpatrikeev Could you please turn on debug mode in the logs:

logs:
  mode: development
  level: debug
  encoding: console
  output: stdout
  err_output: stdout
And paste here the error output?
@tarampampam Could you please help me with the roadrunner-laravel package?
gb_admin | 2021-04-01T12:04:08.000Z WARN server server/plugin.go:208 no free workers in pool {"error": "static_pool_exec_with_context: NoFreeWorkers:\n\tworker_watcher_get_free_worker: no free workers in the container, timeout exceed"}
gb_admin | github.com/spiral/roadrunner/v2/plugins/server.(*Plugin).collectEvents
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/plugins/server/plugin.go:208
gb_admin | github.com/spiral/roadrunner/v2/pkg/events.(*HandlerImpl).Push
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/events/general.go:37
gb_admin | github.com/spiral/roadrunner/v2/pkg/pool.(*StaticPool).getWorker
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/pool/static_pool.go:229
gb_admin | github.com/spiral/roadrunner/v2/pkg/pool.(*StaticPool).execWithTTL
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/pool/static_pool.go:176
gb_admin | github.com/spiral/roadrunner/v2/pkg/pool.(*supervised).Exec.func1
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/pool/supervisor_pool.go:99
First 4 requests were successful, due to num_workers: 4. But after each one I see in the console:

582Z ERROR server server/plugin.go:235 worker_watcher_wait: signal: killed; process_wait: signal: killed
gb_admin | github.com/spiral/roadrunner/v2/plugins/server.(*Plugin).collectEvents
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/plugins/server/plugin.go:235
gb_admin | github.com/spiral/roadrunner/v2/pkg/events.(*HandlerImpl).Push
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/events/general.go:37
gb_admin | github.com/spiral/roadrunner/v2/pkg/worker_watcher.(*workerWatcher).wait
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/worker_watcher/worker_watcher.go:235
gb_admin | github.com/spiral/roadrunner/v2/pkg/worker_watcher.(*workerWatcher).Watch.func1
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/worker_watcher/worker_watcher.go:44
gb_admin | 2021-04-01T12:03:52.294Z DEBUG server server/plugin.go:220 worker constructed {"pid": 55}
gb_admin | 2021-04-01T12:03:52.651Z DEBUG http http/plugin.go:119 200 GET http://0.0.0.0:10081/nova/login {"remote": "172.20.0.1", "elapsed": "592.3817ms"}
gb_admin | 2021-04-01T12:03:52.659Z ERROR server server/plugin.go:235 worker_watcher_wait: signal: killed; process_wait: signal: killed
gb_admin | github.com/spiral/roadrunner/v2/plugins/server.(*Plugin).collectEvents
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/plugins/server/plugin.go:235
gb_admin | github.com/spiral/roadrunner/v2/pkg/events.(*HandlerImpl).Push
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/events/general.go:37
gb_admin | github.com/spiral/roadrunner/v2/pkg/worker_watcher.(*workerWatcher).wait
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/worker_watcher/worker_watcher.go:235
gb_admin | github.com/spiral/roadrunner/v2/pkg/worker_watcher.(*workerWatcher).Watch.func1
gb_admin |     github.com/spiral/roadrunner/v2@v2.0.3/pkg/worker_watcher/worker_watcher.go:44
I guess that you don't free up the worker, so it's blocked inside your PHP worker code and not sending a response back. If possible, could you paste your worker code here (without sensitive information, for sure)?
It is a standard Laravel-roadrunner worker:
#!/usr/bin/env php
<?php
declare(strict_types=1);
\define('RR_WORKER_START', \microtime(true));
\ini_set('display_errors', 'stderr');
/*
|--------------------------------------------------------------------------
| Register The Auto Loader
|--------------------------------------------------------------------------
|
| Composer provides a convenient, automatically generated class loader
| for our application. We just need to utilize it! We'll require it
| into the script here so that we do not have to worry about the
| loading of any our classes "manually". Feels great to relax.
|
*/
$loaded = false;
foreach (['../../..', '../..', '..', 'vendor', '../vendor', '../../vendor'] as $path) {
    if (\is_file($autoload_file = __DIR__ . '/' . $path . '/autoload.php')) {
        require $autoload_file;
        $loaded = true;
        break;
    }
}

if ($loaded !== true) {
    \fwrite(\STDERR, 'Composer autoload file was not found. Try to install project dependencies' . PHP_EOL);
    exit(1);
}
/*
|--------------------------------------------------------------------------
| Find Application Base Path
|--------------------------------------------------------------------------
|
| This file can be located in package `./bin` directory, used as a symbolic
| link or something else. In this case we will try to find application
| base directory using the most obvious application locations.
|
*/
/** @var string|null $base_path */
$base_path = null;
foreach (['../../../..', '../../..', '../..', '..', '../vendor/laravel/laravel'] as $path) {
    if (\is_file(__DIR__ . '/' . $path . '/bootstrap/app.php')) {
        $base_path = (string) \realpath(__DIR__ . '/' . $path);
        break;
    }
}
/*
|--------------------------------------------------------------------------
| Create And Run Console Application
|--------------------------------------------------------------------------
|
| Symfony console component is a nice wrapper around worker CLI options.
|
*/
$app = new \Symfony\Component\Console\Application(
    'RoadRunner worker',
    \Composer\InstalledVersions::getPrettyVersion('spiral/roadrunner-laravel')
);

$app->add(new \Spiral\RoadRunnerLaravel\Console\Commands\StartCommand(
    new \Spiral\RoadRunnerLaravel\Worker(),
    $base_path
));

$app->run();
Do you receive responses for the first 4 requests (if num_workers=4)?
Yes, I do.
If you receive a response, it's just not possible to block after that. If RR sees that MaxNumJobs is reached, it kills the worker, and it's not possible from the process POV to bypass the os.Kill signal. You should see Worker destructed -> Worker constructed in the (debug) logs.
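The "not possible to bypass os.Kill" point can be demonstrated with a tiny standalone Go program (a toy model, not RR code): SIGKILL cannot be caught or ignored by the child process, and Wait reports exactly the "signal: killed" string that appears in the logs earlier in this thread.

```go
package main

import (
	"fmt"
	"os/exec"
)

// killAndWait spawns a long-running child and kills it, the way the
// worker watcher terminates a worker once max_jobs is reached
// (a minimal sketch, not RoadRunner's actual code).
func killAndWait() string {
	cmd := exec.Command("sleep", "60")
	if err := cmd.Start(); err != nil {
		return err.Error()
	}
	_ = cmd.Process.Kill() // delivers SIGKILL; it cannot be caught or ignored
	err := cmd.Wait()      // reports how the child died, like worker_watcher_wait
	if err == nil {
		return "exited normally"
	}
	return err.Error()
}

func main() {
	fmt.Println(killAndWait()) // prints "signal: killed" on Unix
}
```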
We can help you, but we need a full sample of code which we can run to reproduce the issue. Some technical details: when a worker reaches its max_jobs limit (the ReachedMaxJobs state), it's killed: https://github.com/spiral/roadrunner/blob/master/pkg/worker_watcher/worker_watcher.go#L177

For sure, I'll try to reproduce this behavior, but it seems that the problem is somewhere else.
I tried to reproduce the problem, but without success (with the package spiral/roadrunner-laravel v4.0.1). RR config:
rpc:
  listen: tcp://127.0.0.1:6001

server:
  command: "php ./bin/rr-worker start --refresh-app --relay-dsn tcp://127.0.0.1:6666"
  relay: "tcp://127.0.0.1:6666"
  env:
    - APP_KEY: "base64:+hY09mlBag/d7Qhq2SjE/i2iUzZBS1dGObLqcHZU2Ac="
  relay_timeout: 10s

http:
  address: 0.0.0.0:22622
  middleware: ["headers", "static", "gzip"]
  pool:
    num_workers: 2
    max_jobs: 1
    supervisor:
      exec_ttl: 5s
  static:
    dir: "./vendor/laravel/laravel/public"
    forbid: [".php"]
And then the following was executed in three terminals. In the first:
$ rr serve -c ./.rr.yaml
...
2021-04-01T13:08:22.434Z DEBUG container/serve.go:20 called Serve on the vrtx {"vrtx id": "http.Plugin"}
2021-04-01T13:08:22.434Z DEBUG container/serve.go:14 preparing to calling Serve on the Vertex {"vrtx id": "rpc.Plugin"}
2021-04-01T13:08:22.434Z DEBUG RPC rpc/plugin.go:86 Started RPC service {"address": "tcp://127.0.0.1:6001", "services": ["informer", "resetter"]}
2021-04-01T13:08:22.434Z DEBUG container/serve.go:20 called Serve on the vrtx {"vrtx id": "rpc.Plugin"}
2021-04-01T13:08:27.501Z DEBUG server server/plugin.go:222 worker destructed {"pid": 309}
2021-04-01T13:08:28.578Z DEBUG server server/plugin.go:220 worker constructed {"pid": 326}
2021-04-01T13:08:28.578Z DEBUG http http/plugin.go:119 200 GET http://127.0.0.1:22622/ {"remote": "127.0.0.1", "elapsed": "1.112063093s"}
2021-04-01T13:08:34.895Z DEBUG server server/plugin.go:222 worker destructed {"pid": 316}
2021-04-01T13:08:35.970Z DEBUG server server/plugin.go:220 worker constructed {"pid": 336}
2021-04-01T13:08:35.970Z DEBUG http http/plugin.go:119 200 GET http://127.0.0.1:22622/ {"remote": "127.0.0.1", "elapsed": "1.127410345s"}
In the second (many, many times):
$ curl http://127.0.0.1:22622
...
<div class="ml-4 text-center text-sm text-gray-500 sm:text-right sm:ml-0">
Laravel v8.32.1 (PHP v8.0.3)
</div>
</div>
</div>
</div>
</body>
</html>
And third:
$ rr workers -i
+---------+-----------+---------+---------+--------------------+
| PID | STATUS | EXECS | MEMORY | CREATED |
+---------+-----------+---------+---------+--------------------+
| 326 | ready | 0 | 23 MB | 1 minute ago |
| 336 | ready | 0 | 23 MB | 1 minute ago |
+---------+-----------+---------+---------+--------------------+
And it all works as expected. Maybe the problem is on your application's side? Maybe something blocks the execution process?
I am trying on a fresh Laravel app with laravel/nova installed. If max_jobs=0, everything works fine.
I see no worker destructed logs.
@tarampampam I found one interesting thing: my worker relay DSN is just the same as the RPC one, as you can see in my config. Can that be the cause?
Of course :D Change them!
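For anyone who lands here with the same symptom: the RPC listener and the worker relay are separate sockets and must not share an address. A corrected fragment might look like this (ports are just examples):

```yaml
rpc:
  listen: tcp://127.0.0.1:6001   # used by `rr workers`, resets, etc.

server:
  # the relay DSN must differ from the rpc.listen address above
  command: "php ./bin/rr-worker start --relay-dsn tcp://127.0.0.1:6666"
  relay: "tcp://127.0.0.1:6666"
```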
@pvs9 Did it work for you?
@48d90782 Yes, workers are now constructed as needed.
We have a local rr build to run our services with such a config file.
Actually, it runs Laravel bridged workers. Since Laravel Nova is not compatible with async execution (it spawns a lot of DB connections and stores some settings in static vars outside the app container), we decided to set max_jobs: 1 until the problem is solved. I expected this to happen: explanation
As for now, I receive an error in the browser:
Errortrace, Backtrace or Panictrace