konveyor / kantra

A CLI that unifies analysis and transformation capabilities of Konveyor
Apache License 2.0

[BUG] MacOS: too many open files #111

Open jwmatthews opened 10 months ago

jwmatthews commented 10 months ago

Is there an existing issue for this?

Konveyor version

Latest build as of early November

Priority

Major

Current Behavior

Seeing:

INFO[0000] running source code analysis                  args="--provider-settings=/opt/input/config/settings.json --rules=/opt/rulesets/ --output-file=/opt/output/output.yaml --context-lines=100 --dep-label-selector=(!konveyor.io/dep-source=open-source) --verbose=4 --label-selector=((konveyor.io/target=quarkus || konveyor.io/target=jakarta-ee || konveyor.io/target=cloud-readiness) && konveyor.io/source) || (discovery)" input=/Users/jmatthews/git/jwmatthews/kyma_poc/data/coolstuff-javaee log=/Users/jmatthews/git/jwmatthews/kyma_poc/data/example_reports/coolstuff-javaee/analysis.log output=/Users/jmatthews/git/jwmatthews/kyma_poc/data/example_reports/coolstuff-javaee volumes="{\"/Users/jmatthews/git/jwmatthews/kyma_poc/data/coolstuff-javaee\":\"/opt/input/source\",\"/Users/jmatthews/git/jwmatthews/kyma_poc/data/example_reports/coolstuff-javaee\":\"/opt/output\",\"/var/folders/l1/1b7gzz8n02b5nrtdq51bnj4r0000gn/T/analyze-config-1299214165\":\"/opt/input/config\"}"
INFO[0000] generating analysis log in file               file=/Users/jmatthews/git/jwmatthews/kyma_poc/data/example_reports/coolstuff-javaee/analysis.log
ERRO[0326] container run error                           error="exit status 1"
ERRO[0326] failed to run analysis                        error=
Error: 

real    5m26.242s
user    0m0.102s
sys 0m0.149s

When I look at: /Users/jmatthews/git/jwmatthews/kyma_poc/data/example_reports/coolstuff-javaee/analysis.log I see

time="2023-11-07T14:58:38Z" level=error msg="error writing output file" error="open /opt/output/output.yaml: too many open files" file=/opt/output/output.yaml

I've encountered this in the past; see: https://github.com/konveyor/kantra/issues/91#issuecomment-1771600641

I already ran: ulimit -n unlimited

$ ulimit -n 
unlimited

Still seeing the "too many open files" error.
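
As a side note for anyone debugging the host side, "unlimited" is still bounded by macOS kernel caps, so it can help to check what the shell limit actually resolves to. A sketch using standard shell and macOS commands (not kantra-specific; values are not taken from this report):

ulimit -Sn                                     # soft per-process limit for this shell
ulimit -Hn                                     # hard per-process limit for this shell
launchctl limit maxfiles                       # launchd soft/hard maxfiles caps
sysctl kern.maxfilesperproc kern.maxfiles      # kernel per-process and system-wide caps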

Expected Behavior

Analysis should complete

How Reproducible

Always (Default)

Steps To Reproduce

See: https://github.com/jwmatthews/kyma_poc/tree/main/data. The failure can be recreated with the scripts below.

$ cat fetch.sh

pushd .

# Starting point
git clone https://github.com/deewhyweb/eap-coolstore-monolith.git coolstuff-javaee

# Migrated example for Quarkus
git clone https://github.com/mathianasj/eap-coolstore-monolith.git coolstuff-quarkus
cd coolstuff-quarkus
git checkout quarkus-migration
popd

$ cat darwin_restart_podman_machine.sh

#!/bin/sh

# Default variables can be overridden from environment
: ${VM_NAME="kantra"}
: ${MEM=8192}
: ${CPUS=4}
: ${DISK_SIZE=100}

# See https://github.com/konveyor/kantra/issues/91
# See https://github.com/containers/podman/issues/16106#issuecomment-1317188581

ulimit -n unlimited
podman machine stop $VM_NAME
podman machine rm $VM_NAME -f
podman machine init $VM_NAME -v $HOME:$HOME -v /private/tmp:/private/tmp -v /var/folders/:/var/folders/
podman machine set $VM_NAME --cpus $CPUS --memory $MEM --disk-size $DISK_SIZE
podman system connection default $VM_NAME
podman machine start $VM_NAME

$ cat analyze.sh

SOURCE_DIR=coolstuff-javaee
OUTDIR=$PWD/example_reports/${SOURCE_DIR}
mkdir -p $OUTDIR
time ./kantra analyze -i $PWD/$SOURCE_DIR -t "quarkus" -t "jakarta-ee" -t "cloud-readiness" -o $OUTDIR

Environment

- OS: MacOS

Anything else?

No response

jwmatthews commented 10 months ago

Wondering if the limit is related to the VM and not the macOS host.

$ podman machine ssh kantra ulimit -n
1024

I don't seem to be able to bump the ulimit -n setting in the VM:

 $ podman machine ssh kantra ulimit -n unlimited
bash: line 1: ulimit: open files: cannot modify limit: Operation not permitted
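
That error is most likely because an unprivileged shell can only raise its soft limit up to the hard limit, and raising the hard limit requires root. A quick way to see both limits inside the machine (a sketch assuming the machine name kantra, as above):

podman machine ssh kantra "ulimit -Sn; ulimit -Hn"
podman machine ssh kantra "grep 'open files' /proc/self/limits"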
jmontleon commented 10 months ago

If the limit is on the podman machine, perhaps the following will help:

podman machine ssh kantra "echo '* soft nofile 65535' | sudo tee -a /etc/security/limits.conf"
podman machine ssh kantra "echo '* hard nofile 65535' | sudo tee -a /etc/security/limits.conf"
podman machine ssh kantra ulimit -n  # to confirm the change has taken effect
jwmatthews commented 10 months ago

Thank you @jmontleon, that worked for me. Below is how I am running it now:

https://github.com/jwmatthews/kyma_poc/blob/main/data/darwin_restart_podman_machine.sh#L19-L23
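
The linked lines are not reproduced here. As a rough sketch, folding the workaround from the previous comment into darwin_restart_podman_machine.sh would look something like the following, run after podman machine start and using the script's $VM_NAME variable:

podman machine ssh $VM_NAME "echo '* soft nofile 65535' | sudo tee -a /etc/security/limits.conf"
podman machine ssh $VM_NAME "echo '* hard nofile 65535' | sudo tee -a /etc/security/limits.conf"
podman machine ssh $VM_NAME "ulimit -n"   # new ssh sessions should pick up the higher limit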

jwmatthews commented 10 months ago

I think at a minimum we need to update the documentation to share the workaround @jmontleon offered: https://github.com/konveyor/kantra/issues/111#issuecomment-1799289976

shawn-hurley commented 10 months ago

I wonder if something else is happening here, or does it make sense that we are hitting this limit?

jmontleon commented 10 months ago

FWIW, I tried to reproduce this on Fedora using podman-remote / podman machine and I could not; it ran in the VM fine. I found that a little surprising, but I'm not sure what to read into it.

jmontleon commented 10 months ago

A quick search brought me to https://github.com/containers/podman/issues/16106

It looks like @jwmatthews has already been here and I see, "The problem is that the virtiofs server provided by qemu is inheriting the ulimit of the user launching it."

I understand now why this is here, but if that is the issue, I don't understand why it doesn't fix it. My only guess is that it needs a concrete value like 10000 or 65535 instead of unlimited, but since I can't reproduce it, someone with a Mac would have to test the idea. I'm not confident, so it may not be worth the time; besides, it looks like someone already proposed a potential fix: https://github.com/jwmatthews/kyma_poc/blob/main/data/darwin_restart_podman_machine.sh#L12
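
If someone with a Mac does want to test that guess, a minimal variation on the restart script above would be to set a concrete value on the host before recreating the machine, so the qemu/virtiofs process inherits it (untested sketch; 65535 is an arbitrary choice):

ulimit -n 65535      # concrete value instead of 'unlimited'
podman machine rm kantra -f
podman machine init kantra -v $HOME:$HOME -v /private/tmp:/private/tmp -v /var/folders/:/var/folders/
podman machine start kantra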

jwmatthews commented 10 months ago

@jmontleon I thought there might be a difference between the setting on the host vs. the VM.

At the moment we have:

jmontleon commented 10 months ago

Yes, I think it was 1024 by default before making a change.

Unless I'm reading it wrong, looking at the reproducer on the linked issue it seems like it's affected by the ulimit on the host rather than the VM: https://github.com/containers/podman/issues/16106#issuecomment-1799576064

But in our case it seems like the VM ulimit is also problematic on Mac for some reason.

jwmatthews commented 10 months ago

Agreed, the original linked issue was just for the Mac host. I originally used that workaround and it seemed to work... then yesterday I tried a larger sample app with more targets and got stuck on the ulimit in the VM.

At present it seems like both sides have impacted us, but I have not been able to confirm the host side was actually a problem. It seemed intermittent, so I'm a little leery of saying the problem was really on the host. I was able to repeatedly see failures with the VM setting at 1024 and then confirm that bumping it to 65535 made the failures go away. I had to move to a larger app to see these issues appear consistently.

So right now I think I did hit a limit on the VM side, and the workaround of bumping it to 65535 helped.
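
For anyone else hitting this, a quick pre-flight check of both sides before running an analysis (a sketch assuming the machine name kantra and the 65535 workaround above):

ulimit -n                               # host shell soft limit
podman machine ssh kantra "ulimit -n"   # VM soft limit; should report 65535 after the workaround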