omnivector-solutions / slurm-charms

Repository containing the slurm operator charms
https://omnivector-solutions.github.io/osd-documentation/master/
Apache License 2.0

make charm results in error with "file not found" #112

Closed jamesbeedy closed 2 years ago

jamesbeedy commented 3 years ago

Describe the bug

Oftentimes when I build the charms, I get an error saying that the produced charm file is not available. I need to re-run the make charms command, which sometimes fails again on another file.

When running make charms, after the pack command finishes and the .charm is produced, the built charm is transferred from the LXD container build environment to the local filesystem. We immediately follow the charmcraft pack command with a cp command that copies the built charm file to a new filename, <slurm-component>.charm. I have a feeling that the produced .charm artifact needs a moment to finish transferring from the container before we try to access it.

To Reproduce

Run make charms
  make charms
  echo "charm version: $(cat VERSION)"
  shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.9.6/x64
    LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.9.6/x64/lib
For a supported user experience, please use the Charmcraft snap. For more information, please see https://snapcraft.io/charmcraft (full execution logs in '/tmp/charmcraft-log-qgrzyv9h')
cp: cannot stat 'slurmd_ubuntu-20.04-amd64_centos-7-amd64.charm': No such file or directory
make: *** [Makefile:32: slurmd] Error 1

Expected behavior

I expect the make charms command to successfully create the charms.

Additional context

$ snap info charmcraft | grep installed
installed:          1.1.1                         (513) 51MB classic

heitorPB commented 3 years ago

The first snippet says "For a supported user experience, please use the Charmcraft snap." Are you sure you are using Charmcraft from the snap instead of a pip installation?

jamesbeedy commented 3 years ago

Ahh yes, I am on my local setup and I get the same error. I don't think the warning is related.

heitorPB commented 3 years ago

I can't reproduce this on my machine. The same command runs on the CI as well and I can't see any failures on it.

Can you confirm the charmcraft version?

$ charmcraft version
$ which charmcraft
$ snap info charmcraft

jamesbeedy commented 3 years ago

hitting this again

$ make charms
Packing charm 'slurmd_ubuntu-20.04-amd64_centos-7-amd64.charm'...
Created 'slurmd_ubuntu-20.04-amd64_centos-7-amd64.charm'.
Packing charm 'slurmdbd_ubuntu-20.04-amd64_centos-7-amd64.charm'...
Created 'slurmdbd_ubuntu-20.04-amd64_centos-7-amd64.charm'.
Packing charm 'slurmctld_ubuntu-20.04-amd64_centos-7-amd64.charm'...
charmcraft internal error! BaseConfigurationError: Failed to setup snapd.
* Command that failed: 'lxc --project charmcraft exec local:charmcraft-slurmctld-16777867-0-0-amd64 -- env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin CHARMCRAFT_MANAGED_MODE=1 systemctl restart snapd.service'
* Command exit code: 1
* Command standard error output: b'Job for snapd.service canceled.\n' (full execution logs in /home/bdx/snap/charmcraft/common/charmcraft-log-r3pk10zj)
cp: cannot stat 'slurmctld_ubuntu-20.04-amd64_centos-7-amd64.charm': No such file or directory
make: *** [Makefile:43: slurmctld] Error 1
$ charmcraft version
1.2+26.g81a8719
bdx@raton00:~/allcode/github/omnivector/slurm-charms$ which charmcraft
/snap/bin/charmcraft
bdx@raton00:~/allcode/github/omnivector/slurm-charms$ snap info charmcraft
name:      charmcraft
summary:   The charming tool
publisher: Canonical✓
store-url: https://snapcraft.io/charmcraft
license:   Apache-2.0
description: |
  Charmcraft enables charm creators to build, publish, and manage charmed operators for Kubernetes,
  metal and virtual machines.
commands:
  - charmcraft
snap-id:      gcqfpVCOUvmDuYT0Dh5PjdeGypSEzNdV
tracking:     latest/edge
refresh-date: yesterday at 08:45 UTC
channels:
  latest/stable:    1.1.1           2021-07-15 (513) 51MB classic
  latest/candidate: 1.2.0           2021-08-03 (573) 55MB classic
  latest/beta:      ↑
  latest/edge:      1.2+26.g81a8719 2021-08-11 (595) 55MB classic
installed:          1.2+26.g81a8719            (595) 55MB classic

I think we need a sleep in there....

heitorPB commented 3 years ago

charmcraft internal error looks like something outside of our code.

heitorPB commented 3 years ago

This is partly due to https://github.com/canonical/charmcraft/issues/478: Charmcraft exits with status 0 even after an internal error, so make thinks Charmcraft succeeded and continues to the next task.
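To illustrate why the exit status matters here, a minimal sketch follows. The `fake_pack` function is a hypothetical stand-in for a tool that reports an error but still exits with 0; it is not Charmcraft itself, and `run_step` is only an illustrative helper. A caller that decides based on the exit status alone, as make does, carries on as if the step had succeeded.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for a tool that prints an error but still exits 0.
fake_pack() {
    echo "internal error!" >&2
    return 0
}

# Decide whether to continue based only on the command's exit status,
# which is the same information make uses for each recipe line.
run_step() {
    if "$@"; then
        echo "continuing"
    else
        echo "stopping"
    fi
}

# The error message goes to stderr; only the decision is captured.
result=$(run_step fake_pack 2>/dev/null)
echo "$result"   # prints: continuing
```

The error text is visible to a human reading the log, but nothing in the exit status tells the caller to stop.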

mmrezaie commented 2 years ago

I have the same issue. Is there a solution to this problem?

jamesbeedy commented 2 years ago

I think it may be as simple as adding a sleep 1 after calling 'charmcraft build' in each of the slurm charm component build commands in the Makefile.

heitorPB commented 2 years ago

This is a Charmcraft bug: if an internal command fails, Charmcraft crashes but exits with 0. The charm is never built, but the Makefile does not know that and proceeds to the next command.

Adding a sleep 1 would not fix it.
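Since the exit code cannot be trusted in this situation, one possible workaround is to check for the artifact explicitly before copying it. The following is a minimal sketch, not what this repository's Makefile does: `pack_and_copy` is a hypothetical helper, and the file names in the usage note are taken from the logs above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: pack_and_copy BUILT_FILE TARGET_FILE PACK_COMMAND...
# Runs the pack command, then verifies that the expected artifact exists
# before copying it, so a tool that fails with exit status 0 is still caught.
pack_and_copy() {
    local built="$1" target="$2"
    shift 2

    # Fail early if the pack command itself reports a failure.
    "$@" || { echo "error: pack command failed" >&2; return 1; }

    # Do not trust the exit status alone: confirm the artifact was produced.
    if [ ! -f "$built" ]; then
        echo "error: expected artifact '$built' was not produced" >&2
        return 1
    fi

    cp -- "$built" "$target"
}
```

It could be called as `pack_and_copy slurmd_ubuntu-20.04-amd64_centos-7-amd64.charm slurmd.charm charmcraft pack`. The build then stops with a clear message pointing at the pack step instead of a later "cannot stat" error from cp.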

heitorPB commented 2 years ago

Is this problem still hitting you @mmrezaie?

heitorPB commented 2 years ago

Closing this as it was fixed in recent versions of Charmcraft.