kaizen-ai / kaizenflow

KaizenFlow is a framework for Bayesian reasoning and AI/ML stream computing
GNU General Public License v3.0

On-board Sameep Pote #544

Closed DanilYachmenev closed 1 year ago

DanilYachmenev commented 1 year ago

Please follow this checklist. Post any errors you face in this issue.

FYI @gitpaulsmith @gpsaggese @samarth9008

DanilYachmenev commented 1 year ago

Will assign after @Sameep2808 is added to the repo

Sameep2808 commented 1 year ago

@DanilYachmenev I'm facing an issue running docker_bash. It seems it's not able to recognize the host and hence doesn't give me permission. Can you help me with this?

(amp.client_venv) sameep@sameep-ROG-Zephyrus-M16-GU603HM-GU603HM:~/src/sorrentum1$ i docker_bash
One and only one set-up config should be true:
is_cmamp_prod=False
is_dev4=False
is_dev_ck=False
is_ig_prod=False
is_inside_ci=False
is_mac=False
INFO: > cmd='/home/sameep/src/venv/amp.client_venv/bin/invoke docker_bash'
## docker_bash: 
20:39:59 - WARN  <string> _raise_invalid_host:89                        Don't recognize host: host_os_name=Linux, am_host_os_name=None
20:39:59 - WARN  <string> _raise_invalid_host:89                        Don't recognize host: host_os_name=Linux, am_host_os_name=None
20:39:59 - WARN  <string> _raise_invalid_host:89                        Don't recognize host: host_os_name=Linux, am_host_os_name=None
20:39:59 - INFO  lib_tasks_docker.py _docker_cmd:1246                   Pulling the latest version of Docker
## docker_pull: 
## docker_login: 
  ... 
  ... The config profile (ck) could not be found
20:39:59 - INFO  lib_tasks_docker.py _docker_pull:226                   image='sorrentum/cmamp:dev'
docker pull sorrentum/cmamp:dev
dev: Pulling from sorrentum/cmamp
Digest: sha256:c5712b99f12e7e71b01c852727669226856f3c527ac766f00bd4a275e1439931
Status: Image is up to date for sorrentum/cmamp:dev
docker.io/sorrentum/cmamp:dev

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview sorrentum/cmamp:dev
IMAGE=sorrentum/cmamp:dev \
        docker-compose \
        --file /home/sameep/src/sorrentum1/devops/compose/docker-compose.yml \
        --env-file devops/env/default.env \
        run \
        --rm \
        --name sameep.cmamp.app.sorrentum1.20230817_203959 \
        --user $(id -u):$(id -g) \
        app \
        bash 
WARNING: The AM_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The AM_FORCE_TEST_FAIL variable is not set. Defaulting to a blank string.
WARNING: The AM_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The CK_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
Creating compose_app_run ... done
##> devops/docker_run/entrypoint.sh
UID=1000
GID=1000
# Activate environment
##> devops/docker_run/setenv.sh
# Set PATH
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Set PYTHONPATH
PYTHONPATH=/app:
# Configure env
git --version: git version 2.25.1
/app
WARNING: AWS credential check failed: can't find /home/.aws/credentials file.
WARNING: AWS credential check failed: can't find /home/.aws/config file.
# Check AWS authentication setup
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                       am           manual    --profile

The config profile (am) could not be found
AM_CONTAINER_VERSION='1.5.0'
which python: /venv/bin/python
python -V: Python 3.8.10
helpers: <module 'helpers' from '/app/helpers/__init__.py'>
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH=/app:
entrypoint.sh: 'bash'
One and only one set-up config should be true:
is_cmamp_prod=False
is_dev4=False
is_dev_ck=False
is_ig_prod=False
is_inside_ci=False
is_mac=False
Don't recognize host: host_os_name=Linux, am_host_os_name=Linux
Traceback (most recent call last):
  File "/venv/bin/invoke", line 8, in <module>
    sys.exit(program.run())
  File "/venv/lib/python3.8/site-packages/invoke/program.py", line 373, in run
    self.parse_collection()
  File "/venv/lib/python3.8/site-packages/invoke/program.py", line 465, in parse_collection
    self.load_collection()
  File "/venv/lib/python3.8/site-packages/invoke/program.py", line 699, in load_collection
    module, parent = loader.load(coll_name)
  File "/venv/lib/python3.8/site-packages/invoke/loader.py", line 76, in load
    module = imp.load_module(name, fd, path, desc)
  File "/usr/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/usr/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/app/tasks.py", line 106, in <module>
    from oms.lib_tasks_binance import (  # isort: skip # noqa: F401  # pylint: disable=unused-import
  File "/app/oms/__init__.py", line 7, in <module>
    from oms.broker import *  # pylint: disable=unused-import # NOQA
  File "/app/oms/broker.py", line 23, in <module>
    import market_data as mdata
  File "/app/market_data/__init__.py", line 8, in <module>
    from market_data.im_client_market_data import *  # pylint: disable=unused-import # NOQA
  File "/app/market_data/im_client_market_data.py", line 16, in <module>
    import im_v2.common.data.client as icdc
  File "/app/im_v2/common/data/client/__init__.py", line 13, in <module>
    from im_v2.common.data.client.im_raw_data_client import *  # pylint: disable=unused-import # NOQA
  File "/app/im_v2/common/data/client/im_raw_data_client.py", line 21, in <module>
    import im_v2.common.db.db_utils as imvcddbut
  File "/app/im_v2/common/db/db_utils.py", line 24, in <module>
    import im.kibot.sql_writer as imkisqwri
  File "/app/im/kibot/__init__.py", line 5, in <module>
    from im.kibot.data.load.kibot_s3_data_loader import *  # pylint: disable=unused-import # NOQA
  File "/app/im/kibot/data/load/kibot_s3_data_loader.py", line 23, in <module>
    class KibotS3DataLoader(imcdladalo.AbstractS3DataLoader):
  File "/app/im/kibot/data/load/kibot_s3_data_loader.py", line 59, in KibotS3DataLoader
    def _read_csv(
  File "/app/helpers/hcache.py", line 1044, in wrapper
    return _Cached(
  File "/app/helpers/hcache.py", line 416, in __init__
    ) = self._create_function_disk_cache()
  File "/app/helpers/hcache.py", line 739, in _create_function_disk_cache
    disk_cache = get_global_cache(cache_type, self._tag)
  File "/app/helpers/hcache.py", line 239, in get_global_cache
    _DISK_CACHE = _create_global_cache_backend(cache_type)
  File "/app/helpers/hcache.py", line 214, in _create_global_cache_backend
    cache_backend = joblib.Memory(dir_name, verbose=0, compress=True)
  File "/venv/lib/python3.8/site-packages/joblib/memory.py", line 932, in __init__
    self.store_backend = _store_backend_factory(
  File "/venv/lib/python3.8/site-packages/joblib/memory.py", line 128, in _store_backend_factory
    obj.configure(location, verbose=verbose,
  File "/venv/lib/python3.8/site-packages/joblib/_store_backends.py", line 400, in configure
    mkdirp(self.location)
  File "/venv/lib/python3.8/site-packages/joblib/disk.py", line 61, in mkdirp
    os.makedirs(d)
  File "/usr/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/app/tmp.cache.disk'
Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
Destroying '/mnt/tmpfs/tmp.cache.mem' ...
ERROR: 1

Sorry for the delay on the task. I was having trouble getting the Ubuntu 22.04 Wi-Fi drivers installed properly on my laptop. Everything is working now and I have started on the main tasks.

gpsaggese commented 1 year ago

I've seen this error before and I think it's a problem with Ubuntu. It looks like /app/tmp can't be written. Is it related to a user remapping or sudo? @samarth9008 @PomazkinG @jsmerix any idea?
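A minimal sketch for checking the user-remapping hypothesis from inside the container, assuming Python 3 is available there. The function names are hypothetical and not part of the repo's helpers; the path is the one from the `PermissionError` in the traceback above. It reports which existing directory would have to accept the new entry, who owns it, and whether the current process can create something in it.

```python
import os
import stat
from pathlib import Path


def nearest_existing_path(path: str) -> Path:
    """Walk up from `path` until a path that exists is found."""
    p = Path(path).absolute()
    while not p.exists():
        p = p.parent
    return p


def describe_write_access(path: str) -> dict:
    """Report ownership and write access for the closest existing path."""
    target = nearest_existing_path(path)
    st = target.stat()
    return {
        "checked_dir": str(target),
        "mode": stat.filemode(st.st_mode),
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        # Creating an entry needs write + search (execute) permission on the directory.
        "can_create": os.access(target, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # Path taken from the traceback in this thread.
    for key, value in describe_write_access("/app/tmp.cache.disk").items():
        print(f"{key}: {value}")
```

If `owner_uid` differs from `process_uid` and the mode does not grant write access to group or others, that would be consistent with a user id mismatch between the host checkout and the container user.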

DanilYachmenev commented 1 year ago

Meanwhile your warm-up issue is https://github.com/sorrentum/sorrentum/issues/468

Sameep2808 commented 1 year ago

@DanilYachmenev @gpsaggese @samarth9008 I worked on this issue yesterday. At first I thought it was a problem with my Ubuntu installation or system, so I tried completely fresh installations of Ubuntu 22.04 and 20.04 on two different laptops, but I still get the same error on all devices. Can someone guide me with this?

I also found that > source dev_scripts/client_setup/build.sh was crashing the terminal on a freshly installed Ubuntu system. A few dependencies need to be installed that are not mentioned in the document or included in the script. Including them would help Ubuntu users in the future.

sudo apt install python3.8-venv
python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv

Sameep2808 commented 1 year ago

Meanwhile your warm-up issue is #468

@DanilYachmenev @gpsaggese I'm almost done with the current warm-up task and ready for the next one.

gpsaggese commented 1 year ago

1) Did you find the solution to the permission issue?

2) @samarth9008 @jsmerix do we support Ubuntu 20 or 22 on the dev servers? We should suggest that people use the same version we use on the dev servers

3) What exactly was the error you got with

Also I found out that this > source dev_scripts/client_setup/build.sh was crashing the terminal on freshly installed Ubuntu system. Found out that there are few dependencies that needed to be installed that are not mentioned in the document or included in the bash. 

?

4) We use Python3.8 inside the Docker container although I'm a bit surprised that there is a dep from outside the container. In any case, feel free to add these suggestions to the doc in a PR and improve the docs based on your experience

sudo apt install python3.8-venv
python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv

Sameep2808 commented 1 year ago

@gpsaggese

  1. No, I tried but was not able to, so I just used a virtual machine on Windows. I still have to decide whether to continue on Ubuntu.

  2. Since we create a virtual environment with Python before pulling the Docker image, we need these dependencies, which are used to create the virtual environment. Without them, the terminal simply crashed when the script was sourced. I then ran the script line by line in the terminal and found that it was crashing at python3 -m venv $VENV_DIR because the venv dependency was missing. I also found that pyyaml is required to source the setup.bash properly, as raised in issue #437, but that fix is not yet pushed to master.

  3. I have explained the reason above, and if everyone agrees I'll add it to the docs.
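A pre-flight check along these lines could make the missing dependency visible before the setup script reaches python3 -m venv $VENV_DIR. This is only a sketch: the function name is hypothetical, and it assumes the usual Debian/Ubuntu packaging where `ensurepip` (needed by `venv` to bootstrap pip) ships in the separate python3.X-venv package.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the subset of `names` that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # `venv` creates the environment; `ensurepip` installs pip inside it.
    missing = missing_modules(["venv", "ensurepip"])
    if missing:
        version = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Missing modules: {missing}")
        print(f"On Ubuntu, try: sudo apt install python{version}-venv")
    else:
        print("venv prerequisites look fine")
```

Running it with the same python3 that the setup script uses matters, since the check only describes the interpreter that executes it.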

jsmerix commented 1 year ago

@jsmerix do we support Ubuntu 20 or 22 on the dev servers? We should suggest people to use the same version we use on the dev servers

(amp.client_venv) jsmeriga@dev1:~/src$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"

Sameep2808 commented 1 year ago

@jsmerix I'm using the same Ubuntu version but still getting the permission denied error. Let me know if any other user configuration needs to be done.

(amp.client_venv) sameep@sameep-G7-7500:~/src/sorrentum1$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"

gpsaggese commented 1 year ago

I think it's an issue with user id. @jsmerix wdyt?

jsmerix commented 1 year ago

@Sameep2808 can you please run ls -ld ~/src/sorrentum1 and show us the output?

samarth9008 commented 1 year ago

cat /etc/os-release

I think it's an issue with user id. @jsmerix wdyt?

I don't think the issue is with userId. My output is similar

(amp.client_venv) sorrentum@samarthkp:~/src/sorrentum1$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Sameep2808 commented 1 year ago

@Sameep2808 can you please run ls -ld ~/src/sorrentum1 and show us the output?

@DanilYachmenev @gpsaggese @jsmerix @samarth9008 Hey, i docker_bash has started working for me after I followed this tutorial on a fresh installation again yesterday.

But I'm still getting lots of failed tests in i run_fast_tests due to a permission denied error. I'm still looking into it, but let me know if you have seen this before. I have pasted the output below.

FAILED core/config/test/test_config_utils.py::Test_make_hashable::test4 - Per...
FAILED core/config/test/test_config_utils.py::Test_make_hashable::test5 - Per...
FAILED core/config/test/test_config_utils.py::Test_make_hashable::test6 - Per...
= 842 failed, 1202 passed, 229 skipped, 378 deselected, 2 xfailed in 110.26s (0:01:50) =
12:54:44 @ 2023-08-21 08:53:53 - INFO  hcache.py clear_global_cache:293 Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=8.0 KB
12:54:44 @ 2023-08-21 08:53:53 - WARN  hcache.py clear_global_cache:294 Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
12:54:44 @ 2023-08-21 08:53:53 - WARN  hcache.py clear_global_cache:304 Destroying '/mnt/tmpfs/tmp.cache.mem' ...
12:54:44 @ 2023-08-21 08:53:53 - INFO  hcache.py clear_global_cache:320 After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
ERROR: 1
___________________________ Test_make_hashable.test6 ___________________________
Traceback (most recent call last):
  File "/app/core/config/test/test_config_utils.py", line 483, in test6
    self.helper(obj, is_hashable, expected)
  File "/app/core/config/test/test_config_utils.py", line 400, in helper
    self.assert_equal(actual, expected, fuzzy_match=True)
  File "/app/helpers/hunit_test.py", line 1315, in assert_equal
    hio.create_dir(dir_name, incremental=True)
  File "/app/helpers/hio.py", line 284, in create_dir
    _create_dir(
  File "/app/helpers/hio.py", line 354, in _create_dir
    raise e
  File "/app/helpers/hio.py", line 345, in _create_dir
    os.makedirs(dir_name)
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/app/core/config/test/outcomes/Test_make_hashable.test6'

This is the output I'm getting: drwxrwxrwx 32 sameep sameep 4096 Aug 20 12:13 /home/sameep/src/sorrentum1

gpsaggese commented 1 year ago

Thinking around the issue.

Most of the failures are related to the weird /tmp issue. It looks like different versions of Linux handle the basic perms for different system dirs differently. Maybe we should add something like sudo chmod 777 /tmp in devops/docker_build/create_users.sh

@Sameep2808 can you do a sudo chmod 777 /tmp after i docker_bash and see if the error in the test goes away?
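One way to test whether the failures really point at /tmp is to collect the paths named in the `PermissionError` lines of the test log and group them by top-level directory. The sketch below assumes the log is plain text in the format shown in this thread; the function names are hypothetical, and the sample lines are copied from the tracebacks above.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

DENIED_RE = re.compile(r"PermissionError: \[Errno 13\] Permission denied: '([^']+)'")


def denied_paths(log_text: str) -> list:
    """Extract every path reported in a 'Permission denied' error."""
    return DENIED_RE.findall(log_text)


def count_by_top_dir(paths) -> Counter:
    """Count paths by first component, e.g. '/app/oms/x' -> '/app'."""
    counts = Counter()
    for path in paths:
        parts = PurePosixPath(path).parts  # ('/', 'app', 'oms', ...)
        if len(parts) > 1 and parts[0] == "/":
            top = "/" + parts[1]
        else:
            top = parts[0] if parts else path
        counts[top] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "PermissionError: [Errno 13] Permission denied: '/app/tmp.cache.disk'\n"
        "PermissionError: [Errno 13] Permission denied: "
        "'/app/core/config/test/outcomes/Test_make_hashable.test6'\n"
    )
    print(count_by_top_dir(denied_paths(sample)))
```

For the two errors quoted so far, both paths fall under /app rather than /tmp, so a change to /tmp alone may not be enough.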

Sameep2808 commented 1 year ago

@Sameep2808 can you do a sudo chmod 777 /tmp after i docker_bash and see if the error in the test goes away?

@gpsaggese No, it didn't work. The fast tests are still giving the same error.

FAILED core/config/test/test_config_utils.py::Test_make_hashable::test4 - Per...
FAILED core/config/test/test_config_utils.py::Test_make_hashable::test5 - Per...
FAILED core/config/test/test_config_utils.py::Test_make_hashable::test6 - Per...
= 842 failed, 1202 passed, 229 skipped, 378 deselected, 2 xfailed in 80.66s (0:01:20) =
23:09:26 @ 2023-08-21 07:08:48 - INFO  hcache.py clear_global_cache:293 Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=8.0 KB
23:09:26 @ 2023-08-21 07:08:48 - WARN  hcache.py clear_global_cache:294 Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
23:09:26 @ 2023-08-21 07:08:48 - WARN  hcache.py clear_global_cache:304 Destroying '/mnt/tmpfs/tmp.cache.mem' ...
23:09:26 @ 2023-08-21 07:08:48 - INFO  hcache.py clear_global_cache:320 After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
ERROR: 1

gpsaggese commented 1 year ago

The previous post didn't help. We need to open the black box to debug.

1) i docker_bash. Run only 1 failing test. Report the error.

2) i docker_bash. Apply the fix. Run the same failing test. Report the error.

Sameep2808 commented 1 year ago

@gpsaggese It seems that it works if the permissions are 777 for every single directory inside as well. I'll try a recursive chmod and check the remaining test cases.
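Before changing permissions recursively, it may help to list which directories actually block the current user, since the tests create their `outcomes` directories next to the test files. The following is a diagnostic sketch only (the function name is hypothetical); it reports directories and does not modify anything.

```python
import os


def unwritable_dirs(root: str) -> list:
    """Return directories under `root` (inclusive) where the current user cannot create entries."""
    blocked = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Creating a file or subdirectory needs write + search permission.
        if not os.access(dirpath, os.W_OK | os.X_OK):
            blocked.append(dirpath)
    return blocked
```

Calling `unwritable_dirs("/app/oms")` inside the container would show whether only `oms/test` is affected or the whole tree, which is more targeted than applying 777 everywhere. Note that `os.walk` silently skips directories it cannot read, so an empty result under an unreadable root is not conclusive.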

Before

user_1000@a9e9193ba6c3:/app$ pytest oms/test/test_api.py::Test_Position1::test_diff1
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /venv/bin/python
cachedir: .pytest_cache
rootdir: /app, configfile: pytest.ini
plugins: instafail-0.4.2, cov-4.0.0, xdist-3.0.2, rerunfailures-10.2, timeout-2.1.0, anyio-3.6.2
collecting 1 item                                                              -----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.5.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Git
  branch_name='master'
  hash='fb1e591e1'
  # Last commits:
    * fb1e591e1 zli00185 SorrTask 481 Unit test plot_cols() (#543)                         (    4 days ago) Thu Aug 17 20:32:26 2023  (HEAD -> master, origin/master, origin/HEAD)
    * 8536babe6 4nonymous Sorr issue428 tools profiling gdoc to md (#452)                   (    7 days ago) Tue Aug 15 07:29:37 2023  (origin/SorrTask_542_make_convert_to_multiindex_public_and_move_to_a_general_lib)
    * 8add58e09 Yiyun Lei Sorr Task496 refine google sheets flow (#525)                     (    7 days ago) Mon Aug 14 18:58:23 2023           
# Machine info
  system=Linux
  node name=a9e9193ba6c3
  release=5.15.49-linuxkit-pr
  version=#1 SMP Thu May 25 07:17:40 UTC 2023
  machine=x86_64
  processor=x86_64
  cpu count=6
  cpu freq=scpufreq(current=2592.0, min=0.0, max=0.0)
  memory=svmem(total=8106422272, available=6692921344, percent=17.4, used=782004224, free=3452977152, active=537128960, inactive=3749044224, buffers=74117120, cached=3797323776, shared=315658240, slab=253157376)
  disk usage=sdiskusage(total=67317051392, used=9828225024, free=54036127744, percent=15.4)
# Packages
  python: 3.8.10
  cvxopt: 1.3.1
  cvxpy: 1.3.2
  gluonnlp: ?
  gluonts: 0.6.7
  joblib: 1.2.0
  mxnet: 1.9.1
  numpy: 1.23.4
  pandas: 1.5.1
  pyarrow: 10.0.0
  scipy: 1.9.3
  seaborn: 0.12.1
  sklearn: 1.1.3
  statsmodels: 0.13.5
INFO: > cmd='/venv/bin/pytest oms/test/test_api.py::Test_Position1::test_diff1'
INFO: Saving log to file 'tmp.pytest.log'
collected 1 item                                                               

oms/test/test_api.py::Test_Position1::test_diff1 (0.00 s) FAILED                  [100%]

=================================== FAILURES ===================================
__________________________ Test_Position1.test_diff1 ___________________________
Traceback (most recent call last):
  File "/app/oms/test/test_api.py", line 134, in test_diff1
    self.assert_equal(act, exp)
  File "/app/helpers/hunit_test.py", line 1315, in assert_equal
    hio.create_dir(dir_name, incremental=True)
  File "/app/helpers/hio.py", line 284, in create_dir
    _create_dir(
  File "/app/helpers/hio.py", line 354, in _create_dir
    raise e
  File "/app/helpers/hio.py", line 345, in _create_dir
    os.makedirs(dir_name)
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/app/oms/test/outcomes/Test_Position1.test_diff1'
============================= slowest 3 durations ==============================
0.01s call     oms/test/test_api.py::Test_Position1::test_diff1
0.00s setup    oms/test/test_api.py::Test_Position1::test_diff1
0.00s teardown oms/test/test_api.py::Test_Position1::test_diff1
=========================== short test summary info ============================
FAILED oms/test/test_api.py::Test_Position1::test_diff1 - PermissionError: [E...
============================== 1 failed in 4.13s ===============================
23:18:09 - INFO  hcache.py clear_global_cache:293                       Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=32.0 KB
23:18:09 - WARN  hcache.py clear_global_cache:294                       Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
23:18:09 - WARN  hcache.py clear_global_cache:304                       Destroying '/mnt/tmpfs/tmp.cache.mem' ...
23:18:09 - INFO  hcache.py clear_global_cache:320                       After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan

After chmod 777 on just the oms/ directory

user_1000@a9e9193ba6c3:/app$ sudo chmod 777 oms/
user_1000@a9e9193ba6c3:/app$ pytest oms/test/test_api.py::Test_Position1::test_diff1
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /venv/bin/python
cachedir: .pytest_cache
rootdir: /app, configfile: pytest.ini
plugins: instafail-0.4.2, cov-4.0.0, xdist-3.0.2, rerunfailures-10.2, timeout-2.1.0, anyio-3.6.2
collecting 1 item                                                              -----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.5.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Git
  branch_name='master'
  hash='fb1e591e1'
  # Last commits:
    * fb1e591e1 zli00185 SorrTask 481 Unit test plot_cols() (#543)                         (    4 days ago) Thu Aug 17 20:32:26 2023  (HEAD -> master, origin/master, origin/HEAD)
    * 8536babe6 4nonymous Sorr issue428 tools profiling gdoc to md (#452)                   (    7 days ago) Tue Aug 15 07:29:37 2023  (origin/SorrTask_542_make_convert_to_multiindex_public_and_move_to_a_general_lib)
    * 8add58e09 Yiyun Lei Sorr Task496 refine google sheets flow (#525)                     (    7 days ago) Mon Aug 14 18:58:23 2023           
# Machine info
  system=Linux
  node name=a9e9193ba6c3
  release=5.15.49-linuxkit-pr
  version=#1 SMP Thu May 25 07:17:40 UTC 2023
  machine=x86_64
  processor=x86_64
  cpu count=6
  cpu freq=scpufreq(current=2592.0, min=0.0, max=0.0)
  memory=svmem(total=8106422272, available=6691856384, percent=17.4, used=783069184, free=3451842560, active=537354240, inactive=3749322752, buffers=74153984, cached=3797356544, shared=315658240, slab=252780544)
  disk usage=sdiskusage(total=67317051392, used=9828237312, free=54036115456, percent=15.4)
# Packages
  python: 3.8.10
  cvxopt: 1.3.1
  cvxpy: 1.3.2
  gluonnlp: ?
  gluonts: 0.6.7
  joblib: 1.2.0
  mxnet: 1.9.1
  numpy: 1.23.4
  pandas: 1.5.1
  pyarrow: 10.0.0
  scipy: 1.9.3
  seaborn: 0.12.1
  sklearn: 1.1.3
  statsmodels: 0.13.5
INFO: > cmd='/venv/bin/pytest oms/test/test_api.py::Test_Position1::test_diff1'
INFO: Saving log to file 'tmp.pytest.log'
collected 1 item                                                               

oms/test/test_api.py::Test_Position1::test_diff1 (0.00 s) FAILED                  [100%]

=================================== FAILURES ===================================
__________________________ Test_Position1.test_diff1 ___________________________
Traceback (most recent call last):
  File "/app/oms/test/test_api.py", line 134, in test_diff1
    self.assert_equal(act, exp)
  File "/app/helpers/hunit_test.py", line 1315, in assert_equal
    hio.create_dir(dir_name, incremental=True)
  File "/app/helpers/hio.py", line 284, in create_dir
    _create_dir(
  File "/app/helpers/hio.py", line 354, in _create_dir
    raise e
  File "/app/helpers/hio.py", line 345, in _create_dir
    os.makedirs(dir_name)
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/app/oms/test/outcomes/Test_Position1.test_diff1'
============================= slowest 3 durations ==============================
0.01s call     oms/test/test_api.py::Test_Position1::test_diff1
0.00s setup    oms/test/test_api.py::Test_Position1::test_diff1
0.00s teardown oms/test/test_api.py::Test_Position1::test_diff1
=========================== short test summary info ============================
FAILED oms/test/test_api.py::Test_Position1::test_diff1 - PermissionError: [E...
============================== 1 failed in 4.12s ===============================
23:19:08 - INFO  hcache.py clear_global_cache:293                       Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=32.0 KB
23:19:08 - WARN  hcache.py clear_global_cache:294                       Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
23:19:08 - WARN  hcache.py clear_global_cache:304                       Destroying '/mnt/tmpfs/tmp.cache.mem' ...
23:19:08 - INFO  hcache.py clear_global_cache:320                       After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan

After chmod 777 on the inner oms/test/outcomes directory as well:

user_1000@a9e9193ba6c3:/app$ sudo chmod 777 oms/test/outcomes                  
user_1000@a9e9193ba6c3:/app$ pytest oms/test/test_api.py::Test_Position1::test_diff1
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /venv/bin/python
cachedir: .pytest_cache
rootdir: /app, configfile: pytest.ini
plugins: instafail-0.4.2, cov-4.0.0, xdist-3.0.2, rerunfailures-10.2, timeout-2.1.0, anyio-3.6.2
collecting 1 item                                                              -----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.5.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Git
  branch_name='master'
  hash='fb1e591e1'
  # Last commits:
    * fb1e591e1 zli00185 SorrTask 481 Unit test plot_cols() (#543)                         (    4 days ago) Thu Aug 17 20:32:26 2023  (HEAD -> master, origin/master, origin/HEAD)
    * 8536babe6 4nonymous Sorr issue428 tools profiling gdoc to md (#452)                   (    7 days ago) Tue Aug 15 07:29:37 2023  (origin/SorrTask_542_make_convert_to_multiindex_public_and_move_to_a_general_lib)
    * 8add58e09 Yiyun Lei Sorr Task496 refine google sheets flow (#525)                     (    7 days ago) Mon Aug 14 18:58:23 2023           
# Machine info
  system=Linux
  node name=a9e9193ba6c3
  release=5.15.49-linuxkit-pr
  version=#1 SMP Thu May 25 07:17:40 UTC 2023
  machine=x86_64
  processor=x86_64
  cpu count=6
  cpu freq=scpufreq(current=2592.0, min=0.0, max=0.0)
  memory=svmem(total=8106422272, available=6706778112, percent=17.3, used=768143360, free=3466563584, active=539099136, inactive=3738849280, buffers=74211328, cached=3797504000, shared=315662336, slab=252792832)
  disk usage=sdiskusage(total=67317051392, used=9828253696, free=54036099072, percent=15.4)
# Packages
  python: 3.8.10
  cvxopt: 1.3.1
  cvxpy: 1.3.2
  gluonnlp: ?
  gluonts: 0.6.7
  joblib: 1.2.0
  mxnet: 1.9.1
  numpy: 1.23.4
  pandas: 1.5.1
  pyarrow: 10.0.0
  scipy: 1.9.3
  seaborn: 0.12.1
  sklearn: 1.1.3
  statsmodels: 0.13.5
INFO: > cmd='/venv/bin/pytest oms/test/test_api.py::Test_Position1::test_diff1'
INFO: Saving log to file 'tmp.pytest.log'
collected 1 item                                                               

oms/test/test_api.py::Test_Position1::test_diff1 (0.00 s) PASSED                  [100%]

============================= slowest 3 durations ==============================
0.01s call     oms/test/test_api.py::Test_Position1::test_diff1
0.00s setup    oms/test/test_api.py::Test_Position1::test_diff1
0.00s teardown oms/test/test_api.py::Test_Position1::test_diff1
============================== 1 passed in 4.09s ===============================
23:20:13 - INFO  hcache.py clear_global_cache:293                       Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=32.0 KB
23:20:13 - WARN  hcache.py clear_global_cache:294                       Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
23:20:13 - WARN  hcache.py clear_global_cache:304                       Destroying '/mnt/tmpfs/tmp.cache.mem' ...
23:20:13 - INFO  hcache.py clear_global_cache:320                       After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
user_1000@a9e9193ba6c3:/app$ 
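
The runs above fit how os.makedirs behaves: creating a directory needs write and search permission on the directory that will directly contain it, so chmod 777 on oms/ alone does not help when the new entry is created under oms/test/outcomes. A minimal Python sketch of a pre-check follows; the helper names nearest_existing_ancestor and can_create are hypothetical and not part of the repo.

```python
import os
import tempfile


def nearest_existing_ancestor(path: str) -> str:
    """Walk up from `path` until a component that exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root.
            break
        path = parent
    return path


def can_create(path: str) -> bool:
    """Return True if this process could likely create `path` with os.makedirs.

    Only the nearest existing ancestor matters: it must be a directory that
    is writable and searchable by the current user.
    """
    ancestor = nearest_existing_ancestor(path)
    return os.path.isdir(ancestor) and os.access(ancestor, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Hypothetical target, modeled on the path in the traceback.
    target = os.path.join(
        tempfile.gettempdir(), "outcomes", "Test_Position1.test_diff1"
    )
    print(nearest_existing_ancestor(target), can_create(target))
```

Running can_create on the outcome dir of a failing test shows which existing directory actually needs its permissions changed. Note that os.access always reports True for root.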

Thinking around the issue.

Most of the failures are related to the weird /tmp issue. It looks like different versions of Linux set the default permissions of system dirs differently. Maybe we should add something like sudo chmod 777 /tmp to devops/docker_build/create_users.sh

@Sameep2808 can you do a sudo chmod 777 /tmp after i docker_bash and see if the error in the test goes away?
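
To see what the container actually has before changing anything, the permission bits of /tmp can be inspected from Python. This is a minimal sketch; describe_dir_perms is a hypothetical helper, not something in the repo. As a side note, /tmp is conventionally mode 1777 (world-writable with the sticky bit), and chmod 777 clears the sticky bit, so chmod 1777 may be the safer variant; that is a suggestion, not something tested in this thread.

```python
import os
import stat
import tempfile


def describe_dir_perms(path: str) -> dict:
    """Report the permission bits of `path` and whether this process can write to it."""
    mode = os.stat(path).st_mode
    return {
        "path": path,
        # Permission bits only, e.g. '0o1777' for a conventional /tmp.
        "octal": oct(stat.S_IMODE(mode)),
        # Sticky bit: only the owner of an entry may delete or rename it.
        "sticky": bool(mode & stat.S_ISVTX),
        "world_writable": bool(mode & stat.S_IWOTH),
        # Write + search permission for the current (real) user.
        "writable_by_me": os.access(path, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    print(describe_dir_perms(tempfile.gettempdir()))
```

Running this inside the container right after i docker_bash shows whether /tmp is missing the world-writable bit on the affected host.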

Sameep2808 commented 1 year ago

@gpsaggese The recursive permission change worked: after sudo chmod -R 777 /, almost all the tests pass, except 2. I'm looking into them.

run_fast_tests Output:

FAILED helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_class_outcomes
FAILED helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_method_outcomes
= 2 failed, 2042 passed, 229 skipped, 378 deselected, 2 xfailed in 86.15s (0:01:26) =
23:51:37 @ 2023-08-21 07:50:56 - INFO  hcache.py clear_global_cache:293 Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=28.0 KB
23:51:37 @ 2023-08-21 07:50:56 - WARN  hcache.py clear_global_cache:294 Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
23:51:37 @ 2023-08-21 07:50:56 - WARN  hcache.py clear_global_cache:304 Destroying '/mnt/tmpfs/tmp.cache.mem' ...
23:51:37 @ 2023-08-21 07:50:56 - INFO  hcache.py clear_global_cache:320 After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
ERROR: 1

Output of the two failed tests, from pytest helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes:


=================================== FAILURES ===================================
_____________ TestPytestRenameOutcomes.test_rename_class_outcomes ______________
Traceback (most recent call last):
  File "/app/helpers/test/test_hunit_test_utils.py", line 197, in test_rename_class_outcomes
    renamer.rename_outcomes(
  File "/app/helpers/hunit_test_utils.py", line 85, in rename_outcomes
    renamed = self._process_outcomes_dir(outcome_dir, outcomes_path)
  File "/app/helpers/hunit_test_utils.py", line 357, in _process_outcomes_dir
    self._rename_directory(outcome_path_old, outcome_path_new)
  File "/app/helpers/hunit_test_utils.py", line 188, in _rename_directory
    rc = hsystem.system(cmd, abort_on_error=True, suppress_output=False)
  File "/app/helpers/hsystem.py", line 303, in system
    rc, _ = _system(
  File "/app/helpers/hsystem.py", line 277, in _system
    raise RuntimeError(
RuntimeError: cmd='(mv toyCmTask1279.test_rename_class_outcomes/test/outcomes/TestCase.test_check_string1 toyCmTask1279.test_rename_class_outcomes/test/outcomes/TestRenamedCase.test_check_string1) 2>&1' failed with rc='1'
truncated output=
mv: cannot move 'toyCmTask1279.test_rename_class_outcomes/test/outcomes/TestCase.test_check_string1' to 'toyCmTask1279.test_rename_class_outcomes/test/outcomes/TestRenamedCase.test_check_string1/TestCase.test_check_string1': Directory not empty

_____________ TestPytestRenameOutcomes.test_rename_method_outcomes _____________
Traceback (most recent call last):
  File "/app/helpers/test/test_hunit_test_utils.py", line 235, in test_rename_method_outcomes
    renamer.rename_outcomes(
  File "/app/helpers/hunit_test_utils.py", line 85, in rename_outcomes
    renamed = self._process_outcomes_dir(outcome_dir, outcomes_path)
  File "/app/helpers/hunit_test_utils.py", line 357, in _process_outcomes_dir
    self._rename_directory(outcome_path_old, outcome_path_new)
  File "/app/helpers/hunit_test_utils.py", line 188, in _rename_directory
    rc = hsystem.system(cmd, abort_on_error=True, suppress_output=False)
  File "/app/helpers/hsystem.py", line 303, in system
    rc, _ = _system(
  File "/app/helpers/hsystem.py", line 277, in _system
    raise RuntimeError(
RuntimeError: cmd='(mv toyCmTask1279.test_rename_method_outcomes/test/outcomes/TestCase.test_rename toyCmTask1279.test_rename_method_outcomes/test/outcomes/TestCase.test_method_renamed) 2>&1' failed with rc='1'
truncated output=
mv: cannot move 'toyCmTask1279.test_rename_method_outcomes/test/outcomes/TestCase.test_rename' to 'toyCmTask1279.test_rename_method_outcomes/test/outcomes/TestCase.test_method_renamed/TestCase.test_rename': Directory not empty

============================= slowest 3 durations ==============================
0.27s call     helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_class_outcomes
0.26s call     helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_method_outcomes
0.00s setup    helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_class_outcomes
=========================== short test summary info ============================
FAILED helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_class_outcomes
FAILED helpers/test/test_hunit_test_utils.py::TestPytestRenameOutcomes::test_rename_method_outcomes
============================== 2 failed in 2.58s ===============================
DanilYachmenev commented 1 year ago

@Sameep2808 it seems that the target dirs the tests are trying to move the outcomes into are not empty; see the logs. Try going to the dirs mentioned there and removing the tmp files that were left in them.
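
The "Directory not empty" failure can be reproduced in isolation. The sketch below uses os.rename rather than the mv command from the traceback; on Linux it fails in the same way when the destination is a non-empty directory. The rename_dir helper and its remove_stale flag are hypothetical, shown only to illustrate the cleanup; they are not the repo's _rename_directory.

```python
import errno
import os
import shutil
import tempfile


def rename_dir(src: str, dst: str, *, remove_stale: bool = False) -> None:
    """Rename directory `src` to `dst`, optionally deleting a leftover `dst` first."""
    if remove_stale and os.path.isdir(dst):
        shutil.rmtree(dst)
    os.rename(src, dst)


def demo() -> str:
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "TestCase.test_check_string1")
        dst = os.path.join(root, "TestRenamedCase.test_check_string1")
        os.makedirs(src)
        os.makedirs(dst)
        # Simulate a file left over from an earlier, interrupted run.
        with open(os.path.join(dst, "stale.txt"), "w") as f:
            f.write("left over")
        try:
            rename_dir(src, dst)
        except OSError as e:
            # Linux reports ENOTEMPTY; some systems report EEXIST instead.
            if e.errno not in (errno.ENOTEMPTY, errno.EEXIST):
                raise
        else:
            raise AssertionError("rename into a non-empty dir unexpectedly succeeded")
        # After removing the stale destination the rename goes through.
        rename_dir(src, dst, remove_stale=True)
        assert not os.path.exists(src) and os.path.isdir(dst)
        return "renamed after cleanup"


if __name__ == "__main__":
    print(demo())
```

In the failing tests the equivalent cleanup is to delete the leftover toyCmTask1279.* scratch dirs named in the error messages and rerun.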

In any case, on-boarding seems complete: the tests do not seem to reflect any problems with the set-up. Closing.