Hi @fabianospinelli,
Did you try running the command as a non-root user? I do not see the recommended instruction in your Dockerfile:
…
# Run everything after as non-privileged user.
USER pptruser
…
Yes, I also tried that solution. Here is the Dockerfile I used to build the image:
FROM ubuntu:latest
WORKDIR /root
RUN apt-get update \
&& apt-get upgrade -y \
&& apt install -y curl \
&& apt install -y git \
&& apt-get install -y chromium-browser
RUN curl -fsSL https://deb.nodesource.com/setup_19.x | bash - \
&& apt install -y nodejs
# Add user so we don't need --no-sandbox.
RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
&& mkdir -p /home/pptruser/Downloads \
&& chown -R pptruser:pptruser /home/pptruser
RUN npm install --save @opentermsarchive/engine
RUN apt-get install -y libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon-x11-0 libxcomposite-dev \
libxdamage1 libxrandr2 libpangocairo-1.0-0 libasound2 libgbm-dev libnss3
WORKDIR /home/pptruser
RUN mkdir /home/pptruser/open-terms-archive \
&& mkdir /home/pptruser/open-terms-archive/declarations \
&& mkdir /home/pptruser/open-terms-archive/config \
&& cd /home/pptruser/open-terms-archive
COPY declarations.json /home/pptruser/open-terms-archive/declarations
COPY default.json /home/pptruser/open-terms-archive/config
# Run everything after as non-privileged user.
USER pptruser
After the build succeeds, if I run Open Terms Archive from /home/pptruser/open-terms-archive, I get the following error:
pptruser@f0a397eb5789:~/open-terms-archive$ npx ota track
npm ERR! could not determine executable to run
npm ERR! A complete log of this run can be found in:
npm ERR! /home/pptruser/.npm/_logs/2023-03-29T09_25_59_823Z-debug-0.log
I also tried updating npm to the latest version by adding this statement to the Dockerfile:
RUN npm install -g npm@latest
but the result is always the same. Any suggestions?
The error in your latest message occurs because the engine was installed outside the open-terms-archive directory. It can be fixed by installing the engine within the open-terms-archive directory with:
WORKDIR /home/pptruser/open-terms-archive
RUN npm install @opentermsarchive/engine
But this won't solve the problem of running puppeteer in Docker.
I'm sorry, but I tried for more than two hours and did not manage to get things working.
If you succeed on your side, it would be nice to share your Dockerfile with us; otherwise, I suggest you take a look at our Ansible recipes to fully set up a server with the Open Terms Archive engine.
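For anyone hitting the same "could not determine executable to run" error, here is a rough way to check whether the ota executable is visible from the directory where npx is run. This is only a sketch: it assumes npm links package executables into node_modules/.bin, and the upward directory walk approximates how npm locates a project rather than reproducing its actual resolution logic. findLocalBin is a hypothetical helper name.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: walk up from startDir and return the first
// node_modules/.bin/<name> found, or null if no ancestor contains one.
function findLocalBin(name, startDir) {
  let dir = path.resolve(startDir);
  for (;;) {
    const candidate = path.join(dir, 'node_modules', '.bin', name);
    if (fs.existsSync(candidate)) return candidate;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// Print where "ota" would be picked up from the current directory, if anywhere.
console.log(findLocalBin('ota', process.cwd()));
```

With the first Dockerfile, the engine is installed under /root, which is not an ancestor of /home/pptruser/open-terms-archive, so a check like this would find nothing when run from that directory.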
Hi Nicolas, you are right, so I moved the installation statement to the right place. I also added a statement to change the ownership of the newly installed files to the user "pptruser". Here is the new Dockerfile:
FROM ubuntu:latest
WORKDIR /root
RUN apt-get update \
&& apt-get upgrade -y \
&& apt install -y curl \
&& apt install -y git \
&& apt-get install -y chromium-browser
RUN curl -fsSL https://deb.nodesource.com/setup_19.x | bash - \
&& apt install -y nodejs
# Add user so we don't need --no-sandbox.
RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
&& mkdir -p /home/pptruser/Downloads \
&& chown -R pptruser:pptruser /home/pptruser
RUN mkdir /home/pptruser/open-terms-archive \
&& mkdir /home/pptruser/open-terms-archive/declarations \
&& mkdir /home/pptruser/open-terms-archive/config \
&& cd /home/pptruser/open-terms-archive
WORKDIR /home/pptruser/open-terms-archive
RUN npm install --save @opentermsarchive/engine \
&& npm install -g npm@latest
RUN apt-get install -y libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon-x11-0 libxcomposite-dev \
libxdamage1 libxrandr2 libpangocairo-1.0-0 libasound2 libgbm-dev libnss3
RUN chown -R pptruser:pptruser /home/pptruser/open-terms-archive/
WORKDIR /home/pptruser
COPY declarations.json /home/pptruser/open-terms-archive/declarations
COPY default.json /home/pptruser/open-terms-archive/config
# Run everything after as non-privileged user.
USER pptruser
It seems I resolved part of the problem, but when executing Open Terms Archive I get this new error:
pptruser@1366670c10d6:~/open-terms-archive$ npx --no-sandbox ota track
2023-03-29 15:26:57 info Start Open Terms Archive
2023-03-29 15:26:57 info Examining 1 documents from 1 services for refiltering…
2023-03-29 15:26:58 info Examined 1 documents from 1 services for refiltering
2023-03-29 15:26:58 info Recorded 0 new versions
2023-03-29 15:26:58 info Tracking changes of 1 documents from 1 services…
2023-03-29 15:26:58 error unhandledRejection: Failed to launch the browser process!
[0329/152658.370405:FATAL:zygote_host_impl_linux.cc(117)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/main/docs/linux/suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
#0 0x559ca51e4339 base::debug::CollectStackTrace()
#1 0x559ca515af23 base::debug::StackTrace::StackTrace()
#2 0x559ca5158070 logging::LogMessage::~LogMessage()
#3 0x559ca3158c2b content::ZygoteHostImpl::Init()
#4 0x559ca4cd5c0f content::ContentMainRunnerImpl::Initialize()
#5 0x559ca4cd3bfd content::RunContentProcess()
#6 0x559ca4cd3d4e content::ContentMain()
#7 0x559ca4d2b20a headless::(anonymous namespace)::RunContentMain()
#8 0x559ca4d2af15 headless::HeadlessShellMain()
#9 0x559ca160c1e3 ChromeMain
#10 0x7f469d679d90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
#11 0x7f469d679e40 __libc_start_main
#12 0x559ca160c02a _start
TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md
Error: Failed to launch the browser process!
[0329/152658.370405:FATAL:zygote_host_impl_linux.cc(117)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/main/docs/linux/suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
#0 0x559ca51e4339 base::debug::CollectStackTrace()
#1 0x559ca515af23 base::debug::StackTrace::StackTrace()
#2 0x559ca5158070 logging::LogMessage::~LogMessage()
#3 0x559ca3158c2b content::ZygoteHostImpl::Init()
#4 0x559ca4cd5c0f content::ContentMainRunnerImpl::Initialize()
#5 0x559ca4cd3bfd content::RunContentProcess()
#6 0x559ca4cd3d4e content::ContentMain()
#7 0x559ca4d2b20a headless::(anonymous namespace)::RunContentMain()
#8 0x559ca4d2af15 headless::HeadlessShellMain()
#9 0x559ca160c1e3 ChromeMain
#10 0x7f469d679d90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
#11 0x7f469d679e40 __libc_start_main
#12 0x559ca160c02a _start
TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md
at onClose (/home/pptruser/open-terms-archive/node_modules/puppeteer/lib/cjs/puppeteer/node/BrowserRunner.js:255:20)
at Interface.<anonymous> (/home/pptruser/open-terms-archive/node_modules/puppeteer/lib/cjs/puppeteer/node/BrowserRunner.js:248:68)
at Interface.emit (node:events:524:35)
at Interface.close (node:internal/readline/interface:534:10)
at Socket.onend (node:internal/readline/interface:260:10)
at Socket.emit (node:events:524:35)
at endReadableNT (node:internal/streams/readable:1359:12)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
How can I disable the sandbox? Do you have any idea? In any case, I will also try your Ansible solution. I tried just looking at the Ansible scripts and saw that they use Docker. So what's the difference between using this solution and using Docker directly?
Disabling the sandbox is strongly discouraged, and it is currently not possible without modifying the engine.
It can only be done when the puppeteer browser is instantiated with the options --no-sandbox and --disable-setuid-sandbox, like this:
const browser = await puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox'] });
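If someone does patch the engine locally, one way to keep the sandbox on by default is to add those flags only when explicitly requested. The sketch below is an assumption, not something the engine provides: buildLaunchOptions is a hypothetical helper and OTA_DISABLE_SANDBOX is a made-up environment variable name.

```javascript
// Hypothetical helper: builds the options object passed to puppeteer.launch().
// OTA_DISABLE_SANDBOX is an assumed variable name, not an engine setting.
function buildLaunchOptions(env = process.env) {
  const args = [];
  if (env.OTA_DISABLE_SANDBOX === 'true') {
    // Only for containers you fully trust: this turns off Chromium's sandbox.
    args.push('--no-sandbox', '--disable-setuid-sandbox');
  }
  return { args };
}

// Show the options produced when the opt-in variable is set.
console.log(buildLaunchOptions({ OTA_DISABLE_SANDBOX: 'true' }));

// Usage, in code that already has puppeteer loaded:
//   const browser = await puppeteer.launch(buildLaunchOptions());
```

Leaving the variable unset keeps the default behaviour, so the sandbox is only dropped by a deliberate choice.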
> I tried just looking at the Ansible scripts and saw that they use Docker.
I'm not sure I understand what you mean here because, as far as I know, Ansible does not use Docker.
> So what's the difference between using this solution and using Docker directly?
Docker and Ansible serve different purposes:
For example, to highlight the difference, Ansible can be used to manage and deploy Docker containers.
Hi, I finally managed to solve the problem with Docker. The Dockerfile below creates a fully functional image of Open Terms Archive. Over the next few days I will also provide the Docker Compose file and the files (JSON, .env, etc.) that are copied during the image build to configure the environment (declarations, git, etc.).
FROM ubuntu:latest
WORKDIR /root
RUN apt-get update \
&& apt-get upgrade -y \
&& apt install -y curl \
&& apt install -y git \
&& apt-get install -y chromium-browser
RUN curl -fsSL https://deb.nodesource.com/setup_19.x | bash - \
&& apt install -y nodejs
# Add user so we don't need --no-sandbox.
RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
&& mkdir -p /home/pptruser/Downloads \
&& chown -R pptruser:pptruser /home/pptruser
RUN apt-get update \
&& apt-get install -y wget gnupg \
&& wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/googlechrome-linux-keyring.gpg \
&& sh -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/googlechrome-linux-keyring.gpg] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
&& apt-get update \
&& apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-khmeros fonts-kacst fonts-freefont-ttf libxss1 \
--no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir /home/pptruser/open-terms-archive \
&& mkdir /home/pptruser/open-terms-archive/declarations \
&& mkdir /home/pptruser/open-terms-archive/config \
&& cd /home/pptruser/open-terms-archive
WORKDIR /home/pptruser/open-terms-archive
RUN npm install --save @opentermsarchive/engine \
&& npm install -g npm@latest
COPY declarations/OpenTermsArchive.json /home/pptruser/open-terms-archive/declarations
COPY declarations/Siretessile.json /home/pptruser/open-terms-archive/declarations
COPY default.json /home/pptruser/open-terms-archive/config
COPY env /home/pptruser/open-terms-archive/.env
RUN chown -R pptruser:pptruser /home/pptruser/open-terms-archive/
WORKDIR /home/pptruser
# Run everything after as non-privileged user.
USER pptruser
RUN npm i puppeteer \
&& (node -e "require('child_process').execSync(require('puppeteer').executablePath() + ' --credits', {stdio: 'inherit'})" > THIRD_PARTY_NOTICES)
Hi @fabianospinelli, well done on finally solving this issue 👍. If you could create a public repository of a fully functional OTA configuration with Docker, we would be happy to reference it in the documentation for users who want to use Docker 🙂.
Congratulations on getting Open Terms Archive running with Docker! 😃 I understand that this issue has been solved and will close it now 🙂
Hi @fabianospinelli! It's been a month since you indicated you managed to run Open Terms Archive with Docker and expressed your intention to share the files necessary to that end 🙂 Can we help you with this publishing process?
Hi @MattiSG, and thanks for your reply. I'm happy to be able to contribute to OTA with the Docker part we developed. Let me know how to share what we have done and I will do it in the coming days.
Hi @fabianospinelli, Can you create a public GitHub repository containing your working Dockerfile with instructions on how to use it to run an Open Terms Archive engine and how to configure it?
Dear all, I'm trying to create a Docker image with Open Terms Archive for the Joint Research Centre in Ispra (VA). First of all, I installed it on a Mac and everything worked very well. I tested it with some declaration files and did not encounter any problems. The problems arose when I tried to build a Docker image. I prepared this Dockerfile starting from the latest version of Ubuntu:
After the build process, when trying to run the Open Terms Archive engine, I got this error:
So I tried to check which libraries were missing with the following command:
So I installed all the missing libraries with the following commands (the idea is to integrate them into the Dockerfile):
At this point, if I try to run the Open Terms Archive engine, I get another error that I cannot figure out how to resolve. Could you help me?
I also tried to follow the information provided on the Puppeteer Troubleshooting page, but without success: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md