Closed gpsaggese closed 11 months ago
`i pytest`
to run only the tests that should pass on Sorrentum (we will do this at the end; first focus on passing all the tests).
FYI @gpsaggese I edited the comment because interns do not have access to cmamp.
@dchoi127 @Ro0k1e, you guys will work together on this task, since it is big: you have to go through all the tests which are failing and mark them accordingly. @Ro0k1e is working on his first task, so help him get used to our procedure. If needed, I can assign one more person to it.
Go through the whole conversation to get a clearer idea.
There is a PR #401 already created for it. If you want, you can create a different branch and a new PR, or work on the existing one. If you create a new one, there is no need to do the 1st step mentioned in https://github.com/sorrentum/sorrentum/issues/283#issuecomment-1629476544
When I run `i run_fast_tests`, the output just hangs. I've waited about 20 minutes. The following is the output. Am I doing something wrong? I've started up the amp.client_venv (but not the Docker container) and ran `i run_fast_tests` — is this correct?
(amp.client_venv) (base) davidchoi@wireless-10-104-183-87 sorrentum1 % i run_fast_tests
INFO: > cmd='/Users/davidchoi/src/venv/amp.client_venv/bin/invoke run_fast_tests'
## run_fast_tests:
15:31:52 - INFO lib_tasks_pytest.py _run_test_cmd:219 cmd=IMAGE=sorrentum/cmamp:dev \
NETWORK_MODE=bridge \
docker-compose \
--file /Users/davidchoi/src/sorrentum1/devops/compose/docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name davidchoi.cmamp.app.sorrentum1.20230711_153152 \
--user $(id -u):$(id -g) \
app \
'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
15:31:52 - INFO lib_tasks_docker.py _docker_cmd:1252 Pulling the latest version of Docker
## docker_pull:
## docker_login:
...
... The config profile (ck) could not be found
15:31:53 - INFO lib_tasks_docker.py _docker_pull:226 image='sorrentum/cmamp:dev'
docker pull sorrentum/cmamp:dev
dev: Pulling from sorrentum/cmamp
Digest: sha256:7d9ee52407e426c8d0c6611bebb3a5e76bf05d504122aaa50bf6765dc500a2f7
Status: Image is up to date for sorrentum/cmamp:dev
docker.io/sorrentum/cmamp:dev
IMAGE=sorrentum/cmamp:dev \
NETWORK_MODE=bridge \
docker-compose \
--file /Users/davidchoi/src/sorrentum1/devops/compose/docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name davidchoi.cmamp.app.sorrentum1.20230711_153152 \
--user $(id -u):$(id -g) \
app \
'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
WARNING: The AM_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The AM_FORCE_TEST_FAIL variable is not set. Defaulting to a blank string.
WARNING: The AM_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The CK_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
Creating compose_app_run ... done
##> devops/docker_run/entrypoint.sh
UID=501
GID=20
# Activate environment
##> devops/docker_run/setenv.sh
# Set PATH
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Set PYTHONPATH
PYTHONPATH=/app:
# Configure env
git --version: git version 2.25.1
/app
WARNING: AWS credential check failed: can't find /home/.aws/credentials file.
WARNING: AWS credential check failed: can't find /home/.aws/config file.
# Check AWS authentication setup
Name Value Type Location
---- ----- ---- --------
profile am manual --profile
The config profile (am) could not be found
AM_CONTAINER_VERSION='1.4.0'
which python: /venv/bin/python
python -V: Python 3.8.10
helpers: <module 'helpers' from '/app/helpers/__init__.py'>
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH=/app:
entrypoint.sh: 'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
INFO: > cmd='/venv/bin/invoke print_env'
-----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.4.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Repo config:
# repo_config.config
enable_privileged_mode='False'
get_docker_base_image_name='cmamp'
get_docker_shared_group=''
get_docker_user=''
get_host_name='github.com'
get_html_dir_to_url_mapping='{'s3://cryptokaizen-html': 'http://172.30.2.44'}'
get_invalid_words='[]'
get_name='//sorr'
get_repo_map='{'sorr': 'sorrentum/sorrentum'}'
get_shared_data_dirs='None'
has_dind_support='False'
has_docker_sudo='True'
is_CK_S3_available='True'
run_docker_as_root='False'
skip_submodules_test='False'
use_docker_db_container_name_to_connect='True'
use_docker_network_mode_host='False'
use_docker_sibling_containers='True'
# hserver.config
is_AM_S3_available()='True'
is_dev4()='False'
is_dev_ck()='False'
is_inside_ci()='False'
is_inside_docker()='True'
is_mac(version='Catalina')='False'
is_mac(version='Monterey')='False'
is_mac(version='Ventura')='True'
# System signature:
# Git
branch_name='CmTask283_Create_test_list_to_run_with_Sorrentum'
hash='2f2b7daf5'
# Last commits:
* 2f2b7daf5 dchoi127 Gallery Notebook and Test File (#393) ( 4 days ago) Fri Jul 7 12:35:49 2023 (HEAD -> CmTask283_Create_test_list_to_run_with_Sorrentum, origin/CmTask283_Create_test_list_to_run_with_Sorrentum)
* efb7fdeb7 zli00185 Unit test get_universe_versions() for CCXT (#387) ( 5 days ago) Fri Jul 7 00:11:06 2023
* 7d5810984 hhxjqm SorrTask390_Unit_test_get_run_date (#391) ( 5 days ago) Fri Jul 7 00:10:32 2023
# Machine info
system=Linux
node name=347078c60199
release=5.10.76-linuxkit
version=#1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
machine=x86_64
processor=x86_64
cpu count=4
cpu freq=None
memory=svmem(total=2085294080, available=648499200, percent=68.9, used=1246044160, free=48664576, active=642686976, inactive=1206915072, buffers=58896384, cached=731688960, shared=100716544, slab=137158656)
disk usage=sdiskusage(total=62725623808, used=28147331072, free=31361576960, percent=47.3)
# Packages
python: 3.8.10
cvxopt: 1.3.0
cvxpy: 1.2.2
gluonnlp: ?
gluonts: 0.6.7
joblib: 1.2.0
mxnet: 1.9.1
numpy: 1.23.4
pandas: 1.5.1
pyarrow: 10.0.0
scipy: 1.9.3
seaborn: 0.12.1
sklearn: 1.1.3
statsmodels: 0.13.5
# Env vars:
AM_AWS_ACCESS_KEY_ID=undef
AM_AWS_DEFAULT_REGION=undef
AM_AWS_PROFILE='am'
AM_AWS_S3_BUCKET='alphamatic-data'
AM_AWS_SECRET_ACCESS_KEY=undef
AM_ECR_BASE_PATH='665840871993.dkr.ecr.us-east-1.amazonaws.com'
AM_ENABLE_DIND='0'
AM_FORCE_TEST_FAIL=''
AM_HOST_NAME='wireless-10-104-183-87.umd.edu'
AM_HOST_OS_NAME='Darwin'
AM_HOST_USER_NAME='davidchoi'
AM_HOST_VERSION='22.1.0'
AM_REPO_CONFIG_CHECK='True'
AM_REPO_CONFIG_PATH=''
AM_TELEGRAM_TOKEN=empty
CI=''
CK_AWS_ACCESS_KEY_ID=empty
CK_AWS_DEFAULT_REGION=''
CK_AWS_S3_BUCKET='cryptokaizen-data'
CK_AWS_SECRET_ACCESS_KEY=empty
CK_ECR_BASE_PATH='sorrentum'
GH_ACTION_ACCESS_TOKEN=empty
15:32:18 - INFO hcache.py clear_global_cache:292 Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=32.0 KB
15:32:18 - WARN hcache.py clear_global_cache:293 Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
15:32:18 - WARN hcache.py clear_global_cache:303 Destroying '/mnt/tmpfs/tmp.cache.mem' ...
15:32:18 - INFO hcache.py clear_global_cache:319 After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
====================================================================== test session starts =======================================================================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /venv/bin/python
cachedir: .pytest_cache
rootdir: /app, configfile: pytest.ini
plugins: rerunfailures-10.2, cov-4.0.0, xdist-3.0.2, anyio-3.6.2, instafail-0.4.2, timeout-2.1.0
timeout: 5.0s
timeout method: signal
timeout func_only: True
collecting 1437 items
Does the number in
collecting 1437 items
stop at 1437? It should not get stuck there; it should collect around 2616 items and then start the tests, although it takes like 2-3 mins for the tests to start.
Yes it just stops at 1437 and hangs there indefinitely.
Try pulling the latest master or building your environment again, as this is not reproducible on my side.
I am working inside the branch related to the PR in #401. Does that have any effect?
Probably. Try on a new branch.
I tried it on the master branch, which is up to date, and rebuilt the environment. Testing finally started but hangs on the second test. I waited for about 10 minutes; is this the expected behavior?
(amp.client_venv) (base) davidchoi@wireless-10-104-183-87 sorrentum1 % i run_fast_tests
INFO: > cmd='/Users/davidchoi/src/venv/amp.client_venv/bin/invoke run_fast_tests'
## run_fast_tests:
16:24:48 - INFO lib_tasks_pytest.py _run_test_cmd:219 cmd=IMAGE=sorrentum/cmamp:dev \
NETWORK_MODE=bridge \
docker-compose \
--file /Users/davidchoi/src/sorrentum1/devops/compose/docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name davidchoi.cmamp.app.sorrentum1.20230711_162448 \
--user $(id -u):$(id -g) \
app \
'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
16:24:48 - INFO lib_tasks_docker.py _docker_cmd:1252 Pulling the latest version of Docker
## docker_pull:
## docker_login:
...
... The config profile (ck) could not be found
16:24:48 - INFO lib_tasks_docker.py _docker_pull:226 image='sorrentum/cmamp:dev'
docker pull sorrentum/cmamp:dev
dev: Pulling from sorrentum/cmamp
Digest: sha256:7d9ee52407e426c8d0c6611bebb3a5e76bf05d504122aaa50bf6765dc500a2f7
Status: Image is up to date for sorrentum/cmamp:dev
docker.io/sorrentum/cmamp:dev
IMAGE=sorrentum/cmamp:dev \
NETWORK_MODE=bridge \
docker-compose \
--file /Users/davidchoi/src/sorrentum1/devops/compose/docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name davidchoi.cmamp.app.sorrentum1.20230711_162448 \
--user $(id -u):$(id -g) \
app \
'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
WARNING: The AM_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The AM_FORCE_TEST_FAIL variable is not set. Defaulting to a blank string.
WARNING: The AM_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The CK_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
Creating compose_app_run ... done
##> devops/docker_run/entrypoint.sh
UID=501
GID=20
# Activate environment
##> devops/docker_run/setenv.sh
# Set PATH
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Set PYTHONPATH
PYTHONPATH=/app:
# Configure env
git --version: git version 2.25.1
/app
WARNING: AWS credential check failed: can't find /home/.aws/credentials file.
WARNING: AWS credential check failed: can't find /home/.aws/config file.
# Check AWS authentication setup
Name Value Type Location
---- ----- ---- --------
profile am manual --profile
The config profile (am) could not be found
AM_CONTAINER_VERSION='1.4.0'
which python: /venv/bin/python
python -V: Python 3.8.10
helpers: <module 'helpers' from '/app/helpers/__init__.py'>
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH=/app:
entrypoint.sh: 'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
INFO: > cmd='/venv/bin/invoke print_env'
-----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.4.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Repo config:
# repo_config.config
enable_privileged_mode='False'
get_docker_base_image_name='cmamp'
get_docker_shared_group=''
get_docker_user=''
get_host_name='github.com'
get_html_dir_to_url_mapping='{'s3://cryptokaizen-html': 'http://172.30.2.44'}'
get_invalid_words='[]'
get_name='//sorr'
get_repo_map='{'sorr': 'sorrentum/sorrentum'}'
get_shared_data_dirs='None'
has_dind_support='False'
has_docker_sudo='True'
is_CK_S3_available='True'
run_docker_as_root='False'
skip_submodules_test='False'
use_docker_db_container_name_to_connect='True'
use_docker_network_mode_host='False'
use_docker_sibling_containers='True'
# hserver.config
is_AM_S3_available()='True'
is_dev4()='False'
is_dev_ck()='False'
is_inside_ci()='False'
is_inside_docker()='True'
is_mac(version='Catalina')='False'
is_mac(version='Monterey')='False'
is_mac(version='Ventura')='True'
# System signature:
# Git
branch_name='master'
hash='f18adfb30'
# Last commits:
* f18adfb30 GP Saggese Delete UMD_CS_Phonebook_Graduate_Students.csv ( 24 hours ago) Mon Jul 10 19:57:27 2023 (HEAD -> master, origin/master, origin/SorrTask409_Unit_test_plot_timeseries_distribution, origin/HEAD)
* c7cb051c1 Yiyun Lei SorrTask396 Download info for students (#400) ( 32 hours ago) Mon Jul 10 11:56:11 2023
* 20162d520 Yiyun Lei gitignore (#404) ( 33 hours ago) Mon Jul 10 11:53:02 2023
# Machine info
system=Linux
node name=8d2140fb4311
release=5.10.76-linuxkit
version=#1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
machine=x86_64
processor=x86_64
cpu count=4
cpu freq=None
memory=svmem(total=2085294080, available=763457536, percent=63.4, used=1130868736, free=325763072, active=596193280, inactive=1001213952, buffers=37384192, cached=591278080, shared=100589568, slab=115052544)
disk usage=sdiskusage(total=62725623808, used=28147331072, free=31361576960, percent=47.3)
# Packages
python: 3.8.10
cvxopt: 1.3.0
cvxpy: 1.2.2
gluonnlp: ?
gluonts: 0.6.7
joblib: 1.2.0
mxnet: 1.9.1
numpy: 1.23.4
pandas: 1.5.1
pyarrow: 10.0.0
scipy: 1.9.3
seaborn: 0.12.1
sklearn: 1.1.3
statsmodels: 0.13.5
# Env vars:
AM_AWS_ACCESS_KEY_ID=undef
AM_AWS_DEFAULT_REGION=undef
AM_AWS_PROFILE='am'
AM_AWS_S3_BUCKET='alphamatic-data'
AM_AWS_SECRET_ACCESS_KEY=undef
AM_ECR_BASE_PATH='665840871993.dkr.ecr.us-east-1.amazonaws.com'
AM_ENABLE_DIND='0'
AM_FORCE_TEST_FAIL=''
AM_HOST_NAME='wireless-10-104-183-87.umd.edu'
AM_HOST_OS_NAME='Darwin'
AM_HOST_USER_NAME='davidchoi'
AM_HOST_VERSION='22.1.0'
AM_REPO_CONFIG_CHECK='True'
AM_REPO_CONFIG_PATH=''
AM_TELEGRAM_TOKEN=empty
CI=''
CK_AWS_ACCESS_KEY_ID=empty
CK_AWS_DEFAULT_REGION=''
CK_AWS_S3_BUCKET='cryptokaizen-data'
CK_AWS_SECRET_ACCESS_KEY=empty
CK_ECR_BASE_PATH='sorrentum'
GH_ACTION_ACCESS_TOKEN=empty
16:25:14 - INFO hcache.py clear_global_cache:292 Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=32.0 KB
16:25:14 - WARN hcache.py clear_global_cache:293 Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
16:25:14 - WARN hcache.py clear_global_cache:303 Destroying '/mnt/tmpfs/tmp.cache.mem' ...
16:25:14 - INFO hcache.py clear_global_cache:319 After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
====================================================================== test session starts =======================================================================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /venv/bin/python
cachedir: .pytest_cache
rootdir: /app, configfile: pytest.ini
plugins: rerunfailures-10.2, cov-4.0.0, xdist-3.0.2, anyio-3.6.2, instafail-0.4.2, timeout-2.1.0
timeout: 5.0s
timeout method: signal
timeout func_only: True
collecting 2598 items -----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.4.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Git
branch_name='master'
hash='f18adfb30'
# Last commits:
* f18adfb30 GP Saggese Delete UMD_CS_Phonebook_Graduate_Students.csv ( 25 hours ago) Mon Jul 10 19:57:27 2023 (HEAD -> master, origin/master, origin/SorrTask409_Unit_test_plot_timeseries_distribution, origin/HEAD)
* c7cb051c1 Yiyun Lei SorrTask396 Download info for students (#400) ( 33 hours ago) Mon Jul 10 11:56:11 2023
* 20162d520 Yiyun Lei gitignore (#404) ( 33 hours ago) Mon Jul 10 11:53:02 2023
# Machine info
system=Linux
node name=8d2140fb4311
release=5.10.76-linuxkit
version=#1 SMP PREEMPT Mon Nov 8 11:22:26 UTC 2021
machine=x86_64
processor=x86_64
cpu count=4
cpu freq=None
memory=svmem(total=2085294080, available=673005568, percent=67.7, used=1221328896, free=206532608, active=646692864, inactive=1062277120, buffers=37646336, cached=619786240, shared=100589568, slab=122363904)
disk usage=sdiskusage(total=62725623808, used=28147384320, free=31361523712, percent=47.3)
# Packages
python: 3.8.10
cvxopt: 1.3.0
cvxpy: 1.2.2
gluonnlp: ?
gluonts: 0.6.7
joblib: 1.2.0
mxnet: 1.9.1
numpy: 1.23.4
pandas: 1.5.1
pyarrow: 10.0.0
scipy: 1.9.3
seaborn: 0.12.1
sklearn: 1.1.3
statsmodels: 0.13.5
INFO: > cmd='/venv/bin/pytest -m not slow and not superslow . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun Failed: Timeout'
INFO: Saving log to file 'tmp.pytest.log'
collected 2616 items / 190 deselected / 2426 selected
core/plotting/test/test_plots.py::Test_plots::test_plot_histograms_and_lagged_scatterplot1 PASSED [ 0%]
market_data/test/test_real_time_market_data.py::TestRealTimeMarketData2::test_get_data_at_timestamp1
Nope, fast tests usually take less than 5 secs to execute. Probably something is wrong with your system; try to coordinate with @Ro0k1e. If it works on his system, then you can move forward, because this step is only required to recognize the failed tests. Once you know which tests are failing, you can inspect them and add markers accordingly, which can be done on anyone's system. Then you can test on @Ro0k1e's system once the markers are applied.
Sounds good.
@Ro0k1e, could you run the command `i run_fast_tests` after setting up the venv? Mine doesn't seem to work. Hopefully things work better on your system. Thanks!
If you have any questions, I'll try to answer them to the best of my knowledge.
Sorry for the late reply. I was commuting. I will try it now and post my result once I have something.
0) You want to read some docs about pytest: we have docs/Unit_tests.md, and there is also the official documentation at https://docs.pytest.org/en/7.1.x/contents.html
1) You can run the pytest command directly in docker
> i docker_bash
docker> pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout" -s --dbg
to run with more debugging output.
2) You can also run a single test that is hanging
> pytest market_data/test/test_real_time_market_data.py::TestRealTimeMarketData2::test_get_data_at_timestamp1
This test uses imvcddbut.TestImDbHelper, which requires Docker-in-docker (aka dind) or sibling-docker. I would mark it as needs_dind, together with all the unit tests that use imvcddbut.TestImDbHelper.
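A sketch of what that marking could look like (the test class below is a stand-in for the real one, which derives from imvcddbut.TestImDbHelper, and the `needs_dind` marker would still need to be registered in pytest.ini):

```python
import pytest


# Stand-in for the real DB-backed test class; everything that uses
# imvcddbut.TestImDbHelper would get the same marker.
@pytest.mark.needs_dind
class TestRealTimeMarketData2:
    def test_get_data_at_timestamp1(self) -> None:
        # Original test body: talks to a sibling Postgres container.
        pass
```

With the marker in place, `pytest -m "not needs_dind"` deselects all such tests on machines without Docker-in-docker or sibling-container support.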
I activated the thin environment and pulled the latest dev_tools image. Then I ran `i run_fast_tests`. Mine stopped at 1468 items. Trying the `i docker_bash` approach now.
1) What do you mean by "Mine stopped at 1468 items"? Can you report the output?
2) Note that we are fixing a problem with Docker that makes everything slow when running on Arm, since we have only x86 images, so we go through the emulator.
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
What processor are you using?
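Since the jemalloc lines above point at QEMU, a quick way to check for emulation is to compare the host architecture with what the container reports (a sketch; `is_emulated` is a hypothetical helper, and the `docker run` line in the comment is how you would obtain the container's answer):

```shell
# Succeeds (exit 0) when the two architectures differ, i.e., when the
# image is most likely running under QEMU emulation.
is_emulated() {
  host_arch="$1"       # e.g., $(uname -m) on the host: arm64 on Apple Silicon
  container_arch="$2"  # e.g., $(docker run --rm sorrentum/cmamp:dev uname -m)
  [ "${host_arch}" != "${container_arch}" ]
}

# Apple Silicon host running the x86 image:
is_emulated arm64 x86_64 && echo "running under emulation"
```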
3) On my M2 Mac I get
collecting 1354 items / 1 skipped / 1353 selected
collecting 1424 items / 1 skipped / 1423 selected
here it hangs, then if I CTRL-c it starts again
collecting 1449 items / 1 error / 1 skipped / 1447 selected
collecting 1454 items / 1 error / 1 skipped / 1452 selected
...
collecting 2485 items / 1 error / 1 skipped / 2483 selected
IMO there is some test that locks up.
I've run on one of our x86 servers and pytest started immediately
collected 2616 items / 190 deselected / 2426 selected
@Ro0k1e and @dchoi127, are you both using Arm-based Macs? If so, I think the problem is due to running the Docker image through the emulator.
Another piece of info: if I run my Mac in the VPC it doesn't lock up. I think some test is trying to reach some service in our VPC that is not available and so it gets stuck.
I would:
1) fix the Docker issue so we can run fast
2) run the tests by directory so that we can "bisect" where the problem is coming from, e.g., pytest <dir>
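A minimal sketch of that bisection (the directory names below are examples, and the flags mirror the run_fast_tests invocation from the logs above):

```shell
# Build the per-directory pytest command with the same flags that
# run_fast_tests uses, so each directory can be tried in isolation.
bisect_cmd() {
  printf 'pytest -m "not slow and not superslow" %s -o timeout_func_only=true --timeout 5\n' "$1"
}

# Print one command per top-level test directory (run them inside
# `i docker_bash`); the first one that hangs narrows down the search.
for dir in core helpers market_data oms; do
  bisect_cmd "${dir}"
done
```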
I apologize for the confusion. I will remember to paste the code and the result in the future. The following is what I had yesterday; I am using an M1 Mac.
INFO: > cmd='/Users/ywang/src/venv/amp.client_venv/bin/invoke run_fast_tests'
## run_fast_tests:
00:04:29 - INFO lib_tasks_pytest.py _run_test_cmd:219 cmd=IMAGE=sorrentum/cmamp:dev \
NETWORK_MODE=bridge \
docker-compose \
--file /Users/ywang/src/sorrentum1/devops/compose/docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name ywang.cmamp.app.sorrentum1.20230711_210429 \
--user $(id -u):$(id -g) \
app \
'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
00:04:29 - INFO lib_tasks_docker.py _docker_cmd:1252 Pulling the latest version of Docker
## docker_pull:
## docker_login:
...
... The config profile (ck) could not be found
00:04:29 - INFO lib_tasks_docker.py _docker_pull:226 image='sorrentum/cmamp:dev'
docker pull sorrentum/cmamp:dev
dev: Pulling from sorrentum/cmamp
Digest: sha256:7d9ee52407e426c8d0c6611bebb3a5e76bf05d504122aaa50bf6765dc500a2f7
Status: Image is up to date for sorrentum/cmamp:dev
docker.io/sorrentum/cmamp:dev
IMAGE=sorrentum/cmamp:dev \
NETWORK_MODE=bridge \
docker-compose \
--file /Users/ywang/src/sorrentum1/devops/compose/docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name ywang.cmamp.app.sorrentum1.20230711_210429 \
--user $(id -u):$(id -g) \
app \
'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
WARNING: The AM_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The AM_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The AM_FORCE_TEST_FAIL variable is not set. Defaulting to a blank string.
WARNING: The AM_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_ACCESS_KEY_ID variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_DEFAULT_REGION variable is not set. Defaulting to a blank string.
WARNING: The CK_AWS_SECRET_ACCESS_KEY variable is not set. Defaulting to a blank string.
WARNING: The CK_TELEGRAM_TOKEN variable is not set. Defaulting to a blank string.
WARNING: Found orphan containers (compose-oms_postgres7855-1, compose-oms_postgres9256-1, compose-im_postgres2707-1, compose-oms_postgres3920-1, compose-im_postgres2261-1, compose-im_postgres6826-1, compose-oms_postgres4419-1, compose-oms_postgres7941-1, compose-oms_postgres9964-1, compose-im_postgres198-1, compose-im_postgres6680-1, compose-oms_postgres3151-1, compose-oms_postgres6219-1, compose-oms_postgres3477-1, compose-oms_postgres9709-1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Creating compose_app_run ... done
##> devops/docker_run/entrypoint.sh
UID=501
GID=20
# Activate environment
##> devops/docker_run/setenv.sh
# Set PATH
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Set PYTHONPATH
PYTHONPATH=/app:
# Configure env
git --version: git version 2.25.1
/app
WARNING: AWS credential check failed: can't find /home/.aws/credentials file.
WARNING: AWS credential check failed: can't find /home/.aws/config file.
# Check AWS authentication setup
Name Value Type Location
---- ----- ---- --------
profile am manual --profile
The config profile (am) could not be found
AM_CONTAINER_VERSION='1.4.0'
which python: /venv/bin/python
python -V: Python 3.8.10
helpers: <module 'helpers' from '/app/helpers/__init__.py'>
PATH=/app/documentation/scripts:/app/dev_scripts/testing:/app/dev_scripts/notebooks:/app/dev_scripts/install:/app/dev_scripts/infra:/app/dev_scripts/git:/app/dev_scripts/aws:/app/dev_scripts:/app:.:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH=/app:
entrypoint.sh: 'pytest -m "not slow and not superslow" . -o timeout_func_only=true --timeout 5 --reruns 2 --only-rerun "Failed: Timeout"'
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
INFO: > cmd='/venv/bin/invoke print_env'
-----------------------------------------------------------------------------
This code is not in sync with the container:
code_version='1.4.3' != container_version='1.4.0'
-----------------------------------------------------------------------------
You need to:
- merge origin/master into your branch with `invoke git_merge_master`
- pull the latest container with `invoke docker_pull`
# Repo config:
# repo_config.config
enable_privileged_mode='False'
get_docker_base_image_name='cmamp'
get_docker_shared_group=''
get_docker_user=''
get_host_name='github.com'
get_html_dir_to_url_mapping='{'s3://cryptokaizen-html': 'http://172.30.2.44'}'
get_invalid_words='[]'
get_name='//sorr'
get_repo_map='{'sorr': 'sorrentum/sorrentum'}'
get_shared_data_dirs='None'
has_dind_support='False'
has_docker_sudo='True'
is_CK_S3_available='True'
run_docker_as_root='False'
skip_submodules_test='False'
use_docker_db_container_name_to_connect='True'
use_docker_network_mode_host='False'
use_docker_sibling_containers='True'
# hserver.config
is_AM_S3_available()='True'
is_dev4()='False'
is_dev_ck()='False'
is_inside_ci()='False'
is_inside_docker()='True'
is_mac(version='Catalina')='False'
is_mac(version='Monterey')='False'
is_mac(version='Ventura')='True'
# System signature:
# Git
branch_name='master'
hash='f18adfb30'
# Last commits:
* f18adfb30 GP Saggese Delete UMD_CS_Phonebook_Graduate_Students.csv ( 32 hours ago) Mon Jul 10 19:57:27 2023 (HEAD -> master, origin/master, origin/SorrTask409_Unit_test_plot_timeseries_distribution, origin/HEAD)
* c7cb051c1 Yiyun Lei SorrTask396 Download info for students (#400) ( 2 days ago) Mon Jul 10 11:56:11 2023
* 20162d520 Yiyun Lei gitignore (#404) ( 2 days ago) Mon Jul 10 11:53:02 2023
# Machine info
system=Linux
node name=9c81753e234c
release=5.15.49-linuxkit-pr
version=#1 SMP PREEMPT Thu May 25 07:27:39 UTC 2023
machine=x86_64
processor=x86_64
cpu count=4
cpu freq=None
memory=svmem(total=8232951808, available=6388469760, percent=22.4, used=1313353728, free=3372355584, active=501915648, inactive=3657728000, buffers=105328640, cached=3441913856, shared=323547136, slab=547713024)
disk usage=sdiskusage(total=62671097856, used=10935771136, free=48518610944, percent=18.4)
# Packages
python: 3.8.10
cvxopt: 1.3.0
cvxpy: 1.2.2
gluonnlp: ?
gluonts: 0.6.7
joblib: 1.2.0
mxnet: 1.9.1
numpy: 1.23.4
pandas: 1.5.1
pyarrow: 10.0.0
scipy: 1.9.3
seaborn: 0.12.1
sklearn: 1.1.3
statsmodels: 0.13.5
# Env vars:
AM_AWS_ACCESS_KEY_ID=undef
AM_AWS_DEFAULT_REGION=undef
AM_AWS_PROFILE='am'
AM_AWS_S3_BUCKET='alphamatic-data'
AM_AWS_SECRET_ACCESS_KEY=undef
AM_ECR_BASE_PATH='665840871993.dkr.ecr.us-east-1.amazonaws.com'
AM_ENABLE_DIND='0'
AM_FORCE_TEST_FAIL=''
AM_HOST_NAME='Yuans-MacBook-Air.local'
AM_HOST_OS_NAME='Darwin'
AM_HOST_USER_NAME='ywang'
AM_HOST_VERSION='22.5.0'
AM_REPO_CONFIG_CHECK='True'
AM_REPO_CONFIG_PATH=''
AM_TELEGRAM_TOKEN=empty
CI=''
CK_AWS_ACCESS_KEY_ID=empty
CK_AWS_DEFAULT_REGION=''
CK_AWS_S3_BUCKET='cryptokaizen-data'
CK_AWS_SECRET_ACCESS_KEY=empty
CK_ECR_BASE_PATH='sorrentum'
GH_ACTION_ACCESS_TOKEN=empty
00:04:54 - INFO hcache.py clear_global_cache:292 Before clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=32.0 KB
00:04:54 - WARN hcache.py clear_global_cache:293 Resetting 'global mem' cache '/mnt/tmpfs/tmp.cache.mem'
00:04:54 - WARN hcache.py clear_global_cache:303 Destroying '/mnt/tmpfs/tmp.cache.mem' ...
00:04:54 - INFO hcache.py clear_global_cache:319 After clear_global_cache: 'global mem' cache: path='/mnt/tmpfs/tmp.cache.mem', size=nan
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /venv/bin/python
cachedir: .pytest_cache
rootdir: /app, configfile: pytest.ini
plugins: xdist-3.0.2, instafail-0.4.2, rerunfailures-10.2, timeout-2.1.0, cov-4.0.0, anyio-3.6.2
timeout: 5.0s
timeout method: signal
timeout func_only: True
collecting 1005 items
I am using an M1 Mac processor as well.
Another piece of info: if I run my Mac in the VPC it doesn't lock up. I think some test is trying to reach some service in our VPC that is not available and so it gets stuck.
I would:
- fix the Docker issue so we can run fast
- run the tests by directory so that we can "bisect" where the problem is coming from, e.g., pytest <dir>
I think I'm a bit confused on how to proceed. I would greatly appreciate it if you could point me in the right direction. I'm unsure where to start with fixing the Docker issue. Could you point me to the code?
Thanks!
- Merge master in the branch
- Run fast and slow tests. You can see the failures
- The overall goal is to skip the failing tests with appropriate markers. Basically, the pass rate should be 100% with the required tests skipped
- All tests require docker so no need for that marker
- Take a look at the failures. Most of the failing tests require AWS authentication.
- Figure out the "reasons" why chunks of tests fail (e.g., AWS_AM, AWS_CK, ...)
- Create markers for each of them
- We can tweak
i pytest
to run only the tests that should pass on Sorrentum (we will do this at the end; first focus on passing all the tests)
- Feel free to ask any questions you have here.
FYI @gpsaggese I edited the comment because interns do not have access to cmamp
Hi! I reran i run_fast_tests
this morning on my M1 Mac and this time it gave me the result. Still, I am a bit confused about how we should create markers for the files that failed the tests. Could you clarify what a marker is?
Thank you!
Here is a reference to pytest markers. Hope this helps!
https://docs.pytest.org/en/7.1.x/example/markers.html
Additionally, if you browse the files in the PR #401, there are custom pytest markers there.
1) @samarth9008 is fixing the Docker issue, no worries.
2) If you can run the entire i pytest, great; then you can upload the output. Something like:
> i run_fast_tests 2>&1 | tee out.log
then you upload the file to see which tests failed
3) Learn how to use pytest
4) Then you need to understand why the test failed. Typically the reason is pretty clear from the error. As for markers, you can look at the PR that Alejandro started.
You need to add markers like:
@pytest.mark.requires_docker
class TestModelEvaluator1(hunitest.TestCase):
Then you need to add this entry to pytest.ini:
requires_docker: tests that can only be expected to succeed when running in a Docker container
There are several reasons (e.g., on top of my head, need Docker-in-docker, need AWS_S3_AM, AWS_S3_CK, ...)
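To make the marker mechanics concrete, here is a minimal sketch (assuming pytest is installed; the plain class body is a stand-in for the real hunitest.TestCase subclass):

```python
# Minimal sketch of a custom pytest marker, per the comment above.
# `requires_docker` must also be registered under `markers =` in pytest.ini,
# otherwise pytest emits an unknown-marker warning.
import pytest


@pytest.mark.requires_docker
class TestModelEvaluator1:
    # Placeholder test body; the real class derives from hunitest.TestCase.
    def test_smoke(self):
        assert True
```

Marked tests can then be deselected with pytest -m "not requires_docker".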
Hey Yuan. Could you send me the file with the test results? (I misread your comment and thought you still couldn't run it)
Also just wondering, did you do anything special to get the command working? Were you inside a container, only inside the venv, etc?
Thanks
Sorry for the very late reply; I did not check messages and planned to work at night. I did not do anything special: I just opened Docker and ran source dev_scripts/setenv_amp.sh and then i run_fast_tests in the terminal. There is no guarantee that the program will never hang, but this is the result: out.log.
Hi David! This is an update of my findings, and I wonder if I can double-check with you.
It seems that all the failures are due to
* Failed assertion *
File '/home/.aws/credentials' doesn't exist
, which I believe is a lack of AWS credentials.
However, it is a bit odd, since I think the test that failed should have been skipped in the first place. For example, the first failure message is:
=================================== FAILURES ===================================
________ TestTalosHistoricalPqByTileClient2.test_get_end_ts_for_symbol1 ________
Traceback (most recent call last):
File "/app/im_v2/talos/data/client/test/test_talos_clients.py", line 632, in test_get_end_ts_for_symbol1
self._test_get_end_ts_for_symbol1(
File "/app/im_v2/common/data/client/im_client_test_case.py", line 298, in _test_get_end_ts_for_symbol1
actual_end_ts = im_client.get_end_ts_for_symbol(full_symbol)
File "/app/im_v2/common/data/client/base_im_clients.py", line 300, in get_end_ts_for_symbol
return self._get_start_end_ts_for_symbol(full_symbol, mode)
File "/app/im_v2/common/data/client/base_im_clients.py", line 523, in _get_start_end_ts_for_symbol
data = self.read_data(
File "/app/im_v2/common/data/client/base_im_clients.py", line 198, in read_data
df = self._read_data(
File "/app/im_v2/common/data/client/base_im_clients.py", line 656, in _read_data
df = self._read_data_for_multiple_symbols(
File "/app/im_v2/common/data/client/historical_pq_clients.py", line 173, in _read_data_for_multiple_symbols
root_dir_df = hparque.from_parquet(root_dir, **kwargs)
File "/app/helpers/hparquet.py", line 115, in from_parquet
filesystem = get_pyarrow_s3fs(aws_profile)
File "/app/helpers/hparquet.py", line 48, in get_pyarrow_s3fs
aws_credentials = hs3.get_aws_credentials(*args, **kwargs)
File "/app/helpers/hs3.py", line 696, in get_aws_credentials
config = _get_aws_config(file_name)
File "/app/helpers/hs3.py", line 493, in _get_aws_config
hdbg.dassert_file_exists(file_name)
File "/app/helpers/hdbg.py", line 762, in dassert_file_exists
_dfatal(txt, msg, *args, only_warning=only_warning)
File "/app/helpers/hdbg.py", line 142, in _dfatal
dfatal(dfatal_txt)
File "/app/helpers/hdbg.py", line 71, in dfatal
raise assertion_type(ret)
AssertionError:
################################################################################
* Failed assertion *
File '/home/.aws/credentials' doesn't exist
As I checked the code, there is a pytest marker before the declaration of the function:
@pytest.mark.skipif(
not henv.execute_repo_config_code("is_CK_S3_available()"),
reason="Run only if CK S3 is available",
)
So I guess there is some problem with the function is_CK_S3_available(). Although I am still working on it, would you mind taking a look as well?
Thanks!
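For what it's worth, @pytest.mark.skipif skips only when its condition evaluates to True at collection time; here the condition is `not is_CK_S3_available()`, so if the availability check wrongly returns True, the test runs anyway and then fails inside read_data, which would match the behavior above. A hypothetical stand-in for such a check (the real is_CK_S3_available() lives in the repo config, not here) might look like:

```python
import os


def is_aws_credentials_available(aws_dir="~/.aws"):
    # Hypothetical stand-in for is_CK_S3_available(): report availability only
    # if the AWS credentials file actually exists on this machine, so that
    # skipif(not is_aws_credentials_available(), ...) skips cleanly.
    path = os.path.join(os.path.expanduser(aws_dir), "credentials")
    return os.path.isfile(path)
```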
Let us know by Wednesday how you guys are progressing with the issue. If you find it difficult, no worries; we will assign you a different issue. We understand it is a little complex.
@Ro0k1e
There seem to be a couple of failure types apart from AWS credentials. After looking through the whole out file, the issues are:
RuntimeError: cmd='(docker container ls --filter name=/compose-oms_postgres3221-1 -aq) 2>&1' failed with rc='1'
truncated output=
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22name%22%3A%7B%22%2Fcompose-oms_postgres3221-1%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Thus a docker issue (might be because we're not running the command inside a container)
* Failed assertion *
File '/home/.aws/credentials' doesn't exist
As you mentioned, AWS credentials missing.
Failed: Timeout >5.0s
Tests running for too long and thus timing out.
Haven't looked through the files yet. Doing that now.
Currently looking at the first failed test case, which is oms/test/test_restrictions.py::TestRestrictions1::test2.
However, looking at the logs in the output file, it appears that this test is run twice, with the first run passing and the second failing with an error.
oms/test/test_restrictions.py::TestRestrictions1::test2 PASSED [ 1%]
oms/test/test_restrictions.py::TestRestrictions1::test2 ERROR [ 1%]
However, it appears that the command we are executing only reruns tests that fail with Failed: Timeout. Can anyone tell me why this test is running twice?
Also, since this test does run successfully, should I skip it or mark it as well? I'm not sure how to mark this test because it seems that it does pass.
Below is the full error message corresponding to the test
_________________ ERROR at teardown of TestRestrictions1.test2 _________________
Traceback (most recent call last):
File "/app/helpers/hsql_test.py", line 124, in tearDownClass
hdocker.container_rm(container_name)
File "/app/helpers/hdocker.py", line 21, in container_rm
_, container_id = hsystem.system_to_one_line(cmd)
File "/app/helpers/hsystem.py", line 401, in system_to_one_line
rc, output = system_to_string(cmd, *args, **kwargs)
File "/app/helpers/hsystem.py", line 344, in system_to_string
rc, output = _system(
File "/app/helpers/hsystem.py", line 277, in _system
raise RuntimeError(
RuntimeError: cmd='(docker container ls --filter name=/compose-oms_postgres7016-1 -aq) 2>&1' failed with rc='1'
truncated output=
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22name%22%3A%7B%22%2Fcompose-oms_postgres7016-1%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Additionally, after reviewing all tests that fail with
RuntimeError: cmd='(docker container ls --filter name=/compose-im_postgres2555-1 -aq) 2>&1' failed with rc='1'
truncated output=
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22name%22%3A%7B%22%2Fcompose-im_postgres2555-1%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
They all follow the same pattern: they run twice, the first run passes, and the second fails with an error.
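One way to turn that Docker-daemon failure into a clean skip would be a best-effort availability check before the test touches the daemon; this helper is an illustration (the name is hypothetical, not code from the repo):

```python
import os
import shutil


def has_docker_access(sock_path="/var/run/docker.sock"):
    # True only if the docker CLI is on PATH and the daemon socket is readable
    # and writable by the current user; otherwise anything that shells out to
    # `docker container ls` dies with "permission denied", as in the log above.
    if shutil.which("docker") is None:
        return False
    return os.access(sock_path, os.R_OK | os.W_OK)
```

Combined with something like @pytest.mark.skipif(not has_docker_access(), ...), these tests would skip instead of erroring at teardown.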
Ah, my mistake. I only focused on failure messages and ignore errors. Thank you!
That's a good point. @samarth9008 should we overlook errors and only focus on failures?
Guys I think this is a tricky one since it requires debugging and so on. I know you can complete the task, but it's going to be a lot of churn. Feel free to push whatever you have in a PR and I'll take care of this.
@samarth9008 and @DanilYachmenev can you pls assign something more coding-related to our 2 heroes?
@LibertasSpZ for now go over the discussion to get familiar with the issue and share a brief summary of your understanding; afterwards @gpsaggese will instruct you more on the next steps
@LibertasSpZ feel free to start reading this issue (several people attempted, but it was a bit tricky) and try your hand on it. If you get stuck you can pick a time on my calendar for the PP https://calendly.com/gsaggese/30mins-et-afternoon?month=2023-07
If you need bugs in the meantime just ask @DanilYachmenev for more workload
Thanks for the assignment, @gpsaggese and @DanilYachmenev. I ran i run_fast_tests > output.txt 2>&1 on the new branch, and indeed found:
(1) The * Failed assertion * File '/home/.aws/credentials' doesn't exist issue that @dchoi127 and @Ro0k1e mentioned (thank you for identifying these issues btw), and
(2) Some tests PASS in the first run and FAIL in the rerun, as @dchoi127 noted. But not all failed tests follow this pattern; for example, on my end this one seems to fail in the first run:
im_v2/ccxt/data/client/test/test_ccxt_clients.py::TestCcxtHistoricalPqByTileClient1::test_read_data1 (0.01 s) FAILED [ 24%]
Am I correct in understanding that the task is to inspect the failed and errored tests, figure out the cause of the behavior above, and fix it if possible?
The problem is that the tests pass on our server, but they fail on people's laptops since they are not inside the VPN and/or don't have certain resources available.
We want to mark the tests that fail with the reason they fail (as per https://github.com/sorrentum/sorrentum/issues/283#issuecomment-1633173271) so that we can get to the point where the failing tests are skipped, and one
The first step is to post the entire output to see which tests are failing. Then we can do a quick PP session and I can show you what to do. Pick a time here https://calendly.com/gsaggese/30mins-et-afternoon?month=2023-07
Thanks for the clarification, @gpsaggese. For the entire output, please see the attachment out_log.txt.
I will learn more about marking the tests from the linked comment and PR 401.
I will make the appointment in a bit.
@LibertasSpZ
I've put some notes here https://docs.google.com/document/d/1Qm54LwNRlBwzYroDGviW32NevK_V8L4Sufsg5u6IKwI/edit#heading=h.6bhp1lld1a4v
We want to go in order: 1) find classes of failures and think about solutions 2) add pytest markers 3) re-run pytest and see what's left
@LibertasSpZ thx for the log file!
So let's process it a bit and make a more readable report for GP. After a brief look, I've found 4 types of errors into which almost all the failing tests can be separated:
1) Docker issue log:
RuntimeError: cmd='(docker container ls --filter name=/compose-oms_postgres7384-1 -aq) 2>&1' failed with rc='1'
truncated output=
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22name%22%3A%7B%22%2Fcompose-oms_postgres7384-1%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
probably just need to add @pytest.mark.requires_docker on top, as suggested in https://github.com/sorrentum/sorrentum/issues/283#issuecomment-1633173271, but need to clarify with @gpsaggese
2) Missing aws credentials log:
AssertionError:
################################################################################
* Failed assertion *
File '/home/.aws/credentials' doesn't exist
################################################################################
same as p.1
3) Timeout log
Failed: Timeout >5.0s
these are potentially really easy to fix - just change the marker on top to make the test slow. However, it would be good to understand why so many tests dropped out of the time limit
4) Changes in goldens - the log shows the actual txt output on the left and the diverging expected lines on the right. We need to figure out what the changes there are and whether they are expected, and if so, just rerun them with --update_outcomes
In any case, could you pls process all the failures and, as the first step, just provide the lists of tests for each of these error types (and also add another group if one exists). Overall it does seem big, but the errors are mostly the same, so it could go much faster than it looks.
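As a sketch of that first step, the FAILURES section of out.log can be bucketed mechanically. The signature strings below are taken from the errors quoted in this thread; the parsing of pytest's "____ <test name> ____" headers is an assumption about the log shape:

```python
import re
from collections import defaultdict

# Signature substrings taken from the errors quoted in this thread.
SIGNATURES = [
    ("docker", "Docker daemon socket"),
    ("aws", "'/home/.aws/credentials' doesn't exist"),
    ("timeout", "Failed: Timeout"),
]


def classify_failures(log_text):
    # Map each pytest FAILURES header ("____ <test name> ____") to a category;
    # anything unrecognized lands in "other" so no failure is silently dropped.
    groups = defaultdict(list)
    header_re = re.compile(r"^_+ (.+?) _+$", re.M)
    matches = list(header_re.finditer(log_text))
    for i, match in enumerate(matches):
        name = match.group(1).strip()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        body = log_text[match.end():end]
        for category, signature in SIGNATURES:
            if signature in body:
                groups[category].append(name)
                break
        else:
            groups["other"].append(name)
    return dict(groups)
```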
@DanilYachmenev good to see that you read my mind. Let's work on the gdoc together
Let's skateboard @DanilYachmenev: let's start disabling all the failing tests by adding a marker requires_cmamp, and we can enable the tests as per https://github.com/sorrentum/sorrentum/issues/282. We can file bugs for each "class of issues" and distribute them as outsourceable.
To clarify the 1st step: should we put pytest.mark.skip on top of them to disable them, or pytest.mark.requires_cmamp on top of them so we can find them in further iterations?
Close. Let's mark all the tests that are failing in Sorrentum with pytest.mark.requires_cmamp
and then we force these tests to be skipped in the invoke (I can work on this with @LibertasSpZ, I think we have a PP session today)
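Forcing the marked tests to be skipped could be done with a conftest.py hook along these lines. This is a sketch: the RUN_CMAMP_TESTS opt-in variable is hypothetical, and the marker name follows the proposal above, not existing repo code:

```python
# conftest.py sketch: force-skip `requires_cmamp` tests unless explicitly
# opted in (e.g., on infra that actually has cmamp resources).
import os

import pytest


def pytest_collection_modifyitems(config, items):
    # Hypothetical opt-in switch; name is illustrative.
    if os.environ.get("RUN_CMAMP_TESTS") == "1":
        return
    skip_cmamp = pytest.mark.skip(
        reason="Requires cmamp infra not available on Sorrentum"
    )
    for item in items:
        if "requires_cmamp" in item.keywords:
            item.add_marker(skip_cmamp)
```

The hook runs once after collection, so a plain `i run_fast_tests` would then report these tests as skipped instead of failed.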
@gpsaggese I can put the requires_cmamp marks in during or after our PP session today. And thank you very much @DanilYachmenev for classifying the 4 common types of failures.
By the way @DanilYachmenev @gpsaggese, in PR #473 I fixed a failure which does not belong to any common class. The failing test test_plots.py::test_plot_heatmap1 was:
(1) calling get_plot_heatmap1 by the wrong name get_plot_heatmap, and
(2) along the trace, using the deprecated alias np.float, which I changed to np.float64 per pytest's request.
If these changes especially (2) are okay, I believe #473 is ready for merge, and we can focus on marking the common types discussed above.
If you have a sec, take a look at it, where I also indicated solutions for the errors.
Thanks. Took a glance. Maybe for the timeouts [Category 1] we can mark them as slow instead.
From https://github.com/sorrentum/sorrentum/issues/189#issuecomment-1563426753
Contributors can use repos outside our infra on their laptops, thus some tests might not work (e.g., if there is a dependency on AWS).
We want to mark unit tests based on what kind of support is needed; then contributors can just run our pytest flow, skipping all the tests that are not expected to work outside our infra (e.g., no_aws, ...).
Assigning to @samarth9008 as current master of outsourcing. We can do a quick PR to get the skeleton in place. @PomazkinG and @jsmerix can help.