Closed miguelgfierro closed 2 weeks ago
Perhaps we should increase the timeout in case the machine is taking a while to start?
I ran it again and got a slightly different error: https://github.com/recommenders-team/recommenders/actions/runs/10412300049/job/28895746683
We need to check whether the VMs can be accessed.
Traceback (most recent call last):
File "/home/runner/work/recommenders/recommenders/tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py", line 175, in <module>
run_tests(
File "/home/runner/work/recommenders/recommenders/tests/ci/azureml_tests/aml_utils.py", line 171, in run_tests
job = client.jobs.create_or_update(
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 94, in wrapper_use_tracer
return func(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py", line 372, in wrapper
return_value = f(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py", line 663, in create_or_update
self._resolve_arm_id_or_upload_dependencies(job)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py", line 1070, in _resolve_arm_id_or_upload_dependencies
self._resolve_arm_id_or_azureml_id(job, self._orchestrators.get_asset_arm_id)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py", line 1335, in _resolve_arm_id_or_azureml_id
job = self._resolve_arm_id_for_command_job(job, resolver)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py", line 1387, in _resolve_arm_id_for_command_job
job.environment = resolver(job.environment, azureml_type=AzureMLResourceType.ENVIRONMENT)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_operation_orchestrator.py", line 183, in get_asset_arm_id
name, version = self._resolve_name_version_from_name_label(asset, azureml_type)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_operation_orchestrator.py", line 443, in _resolve_name_version_from_name_label
_resolve_label_to_asset(
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/_utils/_asset_utils.py", line 1022, in _resolve_label_to_asset
return resolver(name)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_environment_operations.py", line 448, in _get_latest_version
result = _get_latest(
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/_utils/_asset_utils.py", line 853, in _get_latest
latest = result.next()
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/core/paging.py", line 123, in __next__
return next(self._page_iterator)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/core/paging.py", line 75, in __next__
self._response = self._get_next(self.continuation_token)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/_restclient/v2023_04_01_preview/operations/_environment_versions_operations.py", line 333, in get_next
map_error(status_code=response.status_code, response=response, error_map=error_map)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/core/exceptions.py", line 161, in map_error
raise error
azure.core.exceptions.ResourceNotFoundError: (UserError) System.Net.Http.HttpConnectionResponseContent
Code: UserError
Message: System.Net.Http.HttpConnectionResponseContent
Added a new cluster with the same details as before (Standard_F32s_v2: 32 cores, 64 GB RAM, 256 GB disk, low priority, $0.27/hr per node), but it didn't work either: https://github.com/recommenders-team/recommenders/actions/runs/10454246686
If I try to list the jobs in that cluster I get an error:
Failed to load the list of jobs
ServiceUnavailable: Service temporarily unavailable. Please try again later
Trace ID : a8feeddd-b981-4190-869c-adb9f2cf771f
Client request ID : 088f07ab-4f5e-4e6d-8dc5-5bf79b5218dd
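Since the service itself asks to "try again later", one option is to wrap the submission call in a retry with exponential backoff. The following is only a minimal sketch: `retry_with_backoff` and its parameters are hypothetical names, not part of `aml_utils.py` or the Azure SDK, and retrying would not help while a regional outage lasts.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...] = (Exception,),
    max_attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exception types with exponential backoff.

    Hypothetical helper for illustration; not part of the repository.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

In `run_tests`, the `client.jobs.create_or_update(...)` call from the traceback above could be passed in as a lambda, with `retryable` narrowed to the exception types that are believed to be transient.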
Created a new cluster with dedicated VMs (Standard_F32s_v2: 32 cores, 64 GB RAM, 256 GB disk, dedicated, $1.35/hr per node) and triggered the tests: https://github.com/recommenders-team/recommenders/actions/runs/10454433280
Same error as in https://github.com/recommenders-team/recommenders/pull/2148. No jobs appeared.
Pinned azure-ai-ml==1.18.0 (the version before 1.19.0 on Aug 13). Got the same error. See
@miguelgfierro Could you check if there is an experiment named recommenders-nightly-group_spark_001-python3_9-refs_heads_simonz_azure-ai-ml-1_18_0 in the AML workspace for testing?
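To rule out the pin silently not taking effect, the CI step could print the version that is actually installed before submitting the job. This is a sketch using only the standard library; `installed_version` and `check_pin` are hypothetical helper names, not functions from the repository.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pin(package: str, expected: str) -> bool:
    """Print the expected and installed versions and report whether they match."""
    found = installed_version(package)
    print(f"{package}: expected {expected}, found {found}")
    return found == expected


if __name__ == "__main__":
    # The package name and version are the ones mentioned in this thread.
    check_pin("azure-ai-ml", "1.18.0")
```

If the printed version is 1.18.0 and the error is unchanged, that points away from the SDK release as the cause.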
For the record, the last working test was on August 11th: https://github.com/recommenders-team/recommenders/actions/runs/10335388423
I created a ticket for the AzureML support team: https://ms.portal.azure.com/#view/Microsoft_Azure_Support/SupportRequestDetails.ReactView/id/%2Fsubscriptions%2Fa30dea4f-623a-4bc5-808b-cbf9bc00f7a1%2Fproviders%2FMicrosoft.Support%2FsupportTickets%2Fa1799293-d0bf2ad9-e78876bc-7dd4-47b7-b8eb-215f1017f060
All tests passed after I re-ran them. It may be a problem with the Azure service. See:
I'm running all the tests again to check:
Some are failing. This is very weird: there is no difference in the AzureML code between staging and main. The error is the same:
They have created an ICM ticket. It seems there are issues in the US East region.
Now let me merge https://github.com/recommenders-team/recommenders/pull/2145 and see if everything is green.
It seems everything is green except the CPU nightly. Running staging again: https://github.com/recommenders-team/recommenders/actions/runs/10525482429
It seems there is an error with MIND:
Traceback (most recent call last):
File "/home/runner/work/recommenders/recommenders/tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py", line 175, in <module>
Execution Summary
run_tests(
File "/home/runner/work/recommenders/recommenders/tests/ci/azureml_tests/aml_utils.py", line 184, in run_tests
client.jobs.stream(job.name)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 94, in wrapper_use_tracer
return func(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py", line 292, in wrapper
return f(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_job_operations.py", line 817, in stream
self._stream_logs_until_completion(
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/azure/ai/ml/operations/_job_ops_helper.py", line 334, in stream_logs_until_completion
raise JobException(
azure.ai.ml.exceptions.JobException: Exception :
***
"error": ***
"code": "UserError",
"message": "Execution failed. User process 'python' exited with status code 1. Please check log file 'user_logs/std_log.txt' for error details. Error: 0.04s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDlarge_utils.zip-150359301-0x8D8B8AD5B2ED4C9]\n0.04s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[10m-10681-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]\n0.04s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[10m]\n0.04s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDlarge_train.zip-531361237-0x8D8244E90C15C07]\n0.03s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[10m-10000054-10681-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]\n0.01s teardown tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_small\n0.01s teardown tests/data_validation/recommenders/datasets/test_mind.py::test_download_mind_small\n\n(60 durations < 0.005s hidden. 
Use -vv to show these durations.)\n=========================== short test summary info ============================\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDsmall_train.zip-52953372-0x8D834F2EB31BDEC]\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDsmall_dev.zip-30946172-0x8D834F2EBA8D865]\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDsmall_utils.zip-155178106-0x8D87F67F4AEB960]\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDlarge_train.zip-531361237-0x8D8244E90C15C07]\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDlarge_dev.zip-103593383-0x8D8244E92005849]\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDlarge_utils.zip-150359301-0x8D8B8AD5B2ED4C9]\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_small\nFAILED tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_large\nFAILED tests/data_validation/examples/test_mind.py::test_mind_utils_runs - nb...\n============= 9 failed, 25 passed, 1 warning in 691.98s (0:11:31) ==============\n"
It is not clear what the cause is.
Another weird thing is that the test logs were not fully written:
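The detailed log further down in this thread shows `test_mind_url` failing with `KeyError: 'content-length'`, while the download tests get a 409 response ("Public access is not permitted on this storage account"). A check that looks at the status code before reading headers would surface that kind of cause directly. The sketch below is an assumption about how such a check could look: `diagnose_head_response` is a hypothetical helper, and the ETag comparison is assumed from the test's parameters rather than taken from the test body.

```python
from typing import Mapping, Optional


def diagnose_head_response(
    status_code: int,
    headers: Mapping[str, str],
    expected_length: str,
    expected_etag: str,
) -> Optional[str]:
    """Return None if the response matches, else a readable reason for the mismatch.

    Hypothetical helper for illustration; not part of the test suite.
    """
    if status_code != 200:
        return f"unexpected HTTP status {status_code}"
    # Header names are case-insensitive; normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    length = lowered.get("content-length")
    if length is None:
        return "response has no Content-Length header"
    if length != expected_length:
        return f"Content-Length is {length}, expected {expected_length}"
    etag = lowered.get("etag")
    if etag != expected_etag:
        return f"ETag is {etag}, expected {expected_etag}"
    return None
```

In the test, the result of `requests.head(url)` could be passed in as `r.status_code` and `r.headers`, asserting that the helper returns `None`, so a failure message names the status or the missing header instead of raising a bare `KeyError`.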
Dumping logs in user_logs/std_log.txt
=====================================
============================= test session starts ==============================
platform linux -- Python 3.9.19, pytest-8.3.2, pluggy-1.5.0
rootdir: /mnt/azureml/cr/j/cbe7d84286194babb3a4e8f83d1aff91/exe/wd
configfile: pyproject.toml
plugins: cov-5.0.0, hypothesis-6.108.5, mock-3.14.0, typeguard-4.3.0, anyio-4.4.0
collected 34 items
tests/data_validation/recommenders/datasets/test_movielens.py .......... [ 29%]
It should get past 29% and write the errors.
I found two logs in the test summary.
============================= test session starts ==============================
platform linux -- Python 3.9.19, pytest-8.3.2, pluggy-1.5.0
rootdir: /mnt/azureml/cr/j/cbe7d84286194babb3a4e8f83d1aff91_2/exe/wd
configfile: pyproject.toml
plugins: cov-5.0.0, hypothesis-6.108.5, mock-3.14.0, typeguard-4.3.0, anyio-4.4.0
collected 34 items
tests/data_validation/recommenders/datasets/test_movielens.py .......... [ 29%]
.. [ 35%]
tests/data_validation/recommenders/datasets/test_mind.py ...FFFFFF..FFFF [ 79%]
[ 79%]
tests/data_validation/examples/test_mind.py F. [ 85%]
tests/data_validation/examples/test_wikidata.py .. [ 91%]
tests/smoke/examples/test_notebooks_python.py .. [ 97%]
tests/functional/examples/test_notebooks_python.py . [100%]
=================================== FAILURES ===================================
_ test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip-52952752-0x8D834F2EB31BDEC] _
url = 'https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip'
content_length = '52952752', etag = '0x8D834F2EB31BDEC'
@pytest.mark.parametrize(
"url, content_length, etag",
[
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip",
"17372879",
'"0x8D8B8AD5B233930"',
), # NOTE: the z20 blob returns the etag with ""
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip",
"10080022",
'"0x8D8B8AD5B188839"',
),
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip",
"97292694",
'"0x8D8B8AD5B126C3B"',
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip",
"52952752",
"0x8D834F2EB31BDEC",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip",
"30945572",
"0x8D834F2EBA8D865",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip",
"155178106",
"0x8D87F67F4AEB960",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip",
"530196631",
"0x8D8244E90C15C07",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip",
"103456245",
"0x8D8244E92005849",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip",
"150359301",
"0x8D87F67E6CA4364",
),
],
)
def test_mind_url(url, content_length, etag):
url_headers = requests.head(url).headers
> assert url_headers["Content-Length"] == content_length
tests/data_validation/recommenders/datasets/test_mind.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = {'Transfer-Encoding': 'chunked', 'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0', 'x-ms-request-id': '33dc6f45-d01e-0043-671a-e3db46000000', 'x-ms-version': '2009-09-19', 'Date': 'Wed, 31 Jul 2024 07:19:39 GMT'}
key = 'Content-Length'
def __getitem__(self, key):
> return self._store[key.lower()][1]
E KeyError: 'content-length'
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/structures.py:52: KeyError
_ test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip-30945572-0x8D834F2EBA8D865] _
url = 'https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip'
content_length = '30945572', etag = '0x8D834F2EBA8D865'
@pytest.mark.parametrize(
"url, content_length, etag",
[
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip",
"17372879",
'"0x8D8B8AD5B233930"',
), # NOTE: the z20 blob returns the etag with ""
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip",
"10080022",
'"0x8D8B8AD5B188839"',
),
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip",
"97292694",
'"0x8D8B8AD5B126C3B"',
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip",
"52952752",
"0x8D834F2EB31BDEC",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip",
"30945572",
"0x8D834F2EBA8D865",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip",
"155178106",
"0x8D87F67F4AEB960",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip",
"530196631",
"0x8D8244E90C15C07",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip",
"103456245",
"0x8D8244E92005849",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip",
"150359301",
"0x8D87F67E6CA4364",
),
],
)
def test_mind_url(url, content_length, etag):
url_headers = requests.head(url).headers
> assert url_headers["Content-Length"] == content_length
tests/data_validation/recommenders/datasets/test_mind.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = {'Transfer-Encoding': 'chunked', 'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0', 'x-ms-request-id': 'fefeed5f-e01e-004b-0d1a-e3c035000000', 'x-ms-version': '2009-09-19', 'Date': 'Wed, 31 Jul 2024 07:19:39 GMT'}
key = 'Content-Length'
def __getitem__(self, key):
> return self._store[key.lower()][1]
E KeyError: 'content-length'
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/structures.py:52: KeyError
_ test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip-155178106-0x8D87F67F4AEB960] _
url = 'https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip'
content_length = '155178106', etag = '0x8D87F67F4AEB960'
@pytest.mark.parametrize(
"url, content_length, etag",
[
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip",
"17372879",
'"0x8D8B8AD5B233930"',
), # NOTE: the z20 blob returns the etag with ""
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip",
"10080022",
'"0x8D8B8AD5B188839"',
),
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip",
"97292694",
'"0x8D8B8AD5B126C3B"',
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip",
"52952752",
"0x8D834F2EB31BDEC",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip",
"30945572",
"0x8D834F2EBA8D865",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip",
"155178106",
"0x8D87F67F4AEB960",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip",
"530196631",
"0x8D8244E90C15C07",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip",
"103456245",
"0x8D8244E92005849",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip",
"150359301",
"0x8D87F67E6CA4364",
),
],
)
def test_mind_url(url, content_length, etag):
url_headers = requests.head(url).headers
> assert url_headers["Content-Length"] == content_length
tests/data_validation/recommenders/datasets/test_mind.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = {'Transfer-Encoding': 'chunked', 'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0', 'x-ms-request-id': '2fe9e7aa-b01e-0058-431a-e3f5d4000000', 'x-ms-version': '2009-09-19', 'Date': 'Wed, 31 Jul 2024 07:19:40 GMT'}
key = 'Content-Length'
def __getitem__(self, key):
> return self._store[key.lower()][1]
E KeyError: 'content-length'
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/structures.py:52: KeyError
_ test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip-530196631-0x8D8244E90C15C07] _
url = 'https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip'
content_length = '530196631', etag = '0x8D8244E90C15C07'
@pytest.mark.parametrize(
"url, content_length, etag",
[
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip",
"17372879",
'"0x8D8B8AD5B233930"',
), # NOTE: the z20 blob returns the etag with ""
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip",
"10080022",
'"0x8D8B8AD5B188839"',
),
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip",
"97292694",
'"0x8D8B8AD5B126C3B"',
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip",
"52952752",
"0x8D834F2EB31BDEC",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip",
"30945572",
"0x8D834F2EBA8D865",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip",
"155178106",
"0x8D87F67F4AEB960",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip",
"530196631",
"0x8D8244E90C15C07",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip",
"103456245",
"0x8D8244E92005849",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip",
"150359301",
"0x8D87F67E6CA4364",
),
],
)
def test_mind_url(url, content_length, etag):
url_headers = requests.head(url).headers
> assert url_headers["Content-Length"] == content_length
tests/data_validation/recommenders/datasets/test_mind.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = {'Transfer-Encoding': 'chunked', 'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0', 'x-ms-request-id': 'cd2dfa64-d01e-006a-451a-e3ad04000000', 'x-ms-version': '2009-09-19', 'Date': 'Wed, 31 Jul 2024 07:19:40 GMT'}
key = 'Content-Length'
def __getitem__(self, key):
> return self._store[key.lower()][1]
E KeyError: 'content-length'
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/structures.py:52: KeyError
_ test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip-103456245-0x8D8244E92005849] _
url = 'https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip'
content_length = '103456245', etag = '0x8D8244E92005849'
@pytest.mark.parametrize(
"url, content_length, etag",
[
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip",
"17372879",
'"0x8D8B8AD5B233930"',
), # NOTE: the z20 blob returns the etag with ""
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip",
"10080022",
'"0x8D8B8AD5B188839"',
),
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip",
"97292694",
'"0x8D8B8AD5B126C3B"',
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip",
"52952752",
"0x8D834F2EB31BDEC",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip",
"30945572",
"0x8D834F2EBA8D865",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip",
"155178106",
"0x8D87F67F4AEB960",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip",
"530196631",
"0x8D8244E90C15C07",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip",
"103456245",
"0x8D8244E92005849",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip",
"150359301",
"0x8D87F67E6CA4364",
),
],
)
def test_mind_url(url, content_length, etag):
url_headers = requests.head(url).headers
> assert url_headers["Content-Length"] == content_length
tests/data_validation/recommenders/datasets/test_mind.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = {'Transfer-Encoding': 'chunked', 'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0', 'x-ms-request-id': '21a0405e-b01e-001c-6b1a-e329b8000000', 'x-ms-version': '2009-09-19', 'Date': 'Wed, 31 Jul 2024 07:19:41 GMT'}
key = 'Content-Length'
def __getitem__(self, key):
> return self._store[key.lower()][1]
E KeyError: 'content-length'
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/structures.py:52: KeyError
_ test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip-150359301-0x8D87F67E6CA4364] _
url = 'https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip'
content_length = '150359301', etag = '0x8D87F67E6CA4364'
@pytest.mark.parametrize(
"url, content_length, etag",
[
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip",
"17372879",
'"0x8D8B8AD5B233930"',
), # NOTE: the z20 blob returns the etag with ""
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip",
"10080022",
'"0x8D8B8AD5B188839"',
),
(
"https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip",
"97292694",
'"0x8D8B8AD5B126C3B"',
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip",
"52952752",
"0x8D834F2EB31BDEC",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip",
"30945572",
"0x8D834F2EBA8D865",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip",
"155178106",
"0x8D87F67F4AEB960",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip",
"530196631",
"0x8D8244E90C15C07",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip",
"103456245",
"0x8D8244E92005849",
),
(
"https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip",
"150359301",
"0x8D87F67E6CA4364",
),
],
)
def test_mind_url(url, content_length, etag):
url_headers = requests.head(url).headers
> assert url_headers["Content-Length"] == content_length
tests/data_validation/recommenders/datasets/test_mind.py:63:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = {'Transfer-Encoding': 'chunked', 'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0', 'x-ms-request-id': '7c88778d-301e-006b-2e1a-e3acf9000000', 'x-ms-version': '2009-09-19', 'Date': 'Wed, 31 Jul 2024 07:19:41 GMT'}
key = 'Content-Length'
def __getitem__(self, key):
> return self._store[key.lower()][1]
E KeyError: 'content-length'
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/structures.py:52: KeyError
___________________________ test_download_mind_small ___________________________
tmp = '/tmp/pytest-of-root/pytest-0/tmpajcub2fk'
def test_download_mind_small(tmp):
> train_path, valid_path = download_mind(size="small", dest_path=tmp)
tests/data_validation/recommenders/datasets/test_mind.py:76:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
recommenders/datasets/mind.py:66: in download_mind
train_path = maybe_download(url=url_train, work_directory=path)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:56: in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:266: in call
raise attempt.get()
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:301: in get
six.reraise(self.value[0], self.value[1], self.value[2])
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/six.py:719: in reraise
raise value
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:251: in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
recommenders/datasets/download_utils.py:52: in maybe_download
r.raise_for_status()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Response [409]>
def raise_for_status(self):
"""Raises :class:`HTTPError`, if one occurred."""
http_error_msg = ""
if isinstance(self.reason, bytes):
# We attempt to decode utf-8 first because some servers
# choose to localize their reason strings. If the string
# isn't utf-8, we fall back to iso-8859-1 for all other
# encodings. (See PR #3538)
try:
reason = self.reason.decode("utf-8")
except UnicodeDecodeError:
reason = self.reason.decode("iso-8859-1")
else:
reason = self.reason
if 400 <= self.status_code < 500:
http_error_msg = (
f"{self.status_code} Client Error: {reason} for url: {self.url}"
)
elif 500 <= self.status_code < 600:
http_error_msg = (
f"{self.status_code} Server Error: {reason} for url: {self.url}"
)
if http_error_msg:
> raise HTTPError(http_error_msg, response=self)
E requests.exceptions.HTTPError: 409 Client Error: Public access is not permitted on this storage account. for url: https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/models.py:1024: HTTPError
------------------------------ Captured log call -------------------------------
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
___________________________ test_extract_mind_small ____________________________
tmp = '/tmp/pytest-of-root/pytest-0/tmpb28zqnxl'
def test_extract_mind_small(tmp):
> train_zip, valid_zip = download_mind(size="small", dest_path=tmp)
tests/data_validation/recommenders/datasets/test_mind.py:106:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
recommenders/datasets/mind.py:66: in download_mind
train_path = maybe_download(url=url_train, work_directory=path)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:56: in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:266: in call
raise attempt.get()
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:301: in get
six.reraise(self.value[0], self.value[1], self.value[2])
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/six.py:719: in reraise
raise value
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:251: in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
recommenders/datasets/download_utils.py:52: in maybe_download
r.raise_for_status()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Response [409]>
def raise_for_status(self):
"""Raises :class:`HTTPError`, if one occurred."""
http_error_msg = ""
if isinstance(self.reason, bytes):
# We attempt to decode utf-8 first because some servers
# choose to localize their reason strings. If the string
# isn't utf-8, we fall back to iso-8859-1 for all other
# encodings. (See PR #3538)
try:
reason = self.reason.decode("utf-8")
except UnicodeDecodeError:
reason = self.reason.decode("iso-8859-1")
else:
reason = self.reason
if 400 <= self.status_code < 500:
http_error_msg = (
f"{self.status_code} Client Error: {reason} for url: {self.url}"
)
elif 500 <= self.status_code < 600:
http_error_msg = (
f"{self.status_code} Server Error: {reason} for url: {self.url}"
)
if http_error_msg:
> raise HTTPError(http_error_msg, response=self)
E requests.exceptions.HTTPError: 409 Client Error: Public access is not permitted on this storage account. for url: https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/models.py:1024: HTTPError
------------------------------ Captured log call -------------------------------
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
___________________________ test_download_mind_large ___________________________
tmp_path = PosixPath('/tmp/pytest-of-root/pytest-0/test_download_mind_large0')
def test_download_mind_large(tmp_path):
> train_path, valid_path = download_mind(size="large", dest_path=tmp_path)
tests/data_validation/recommenders/datasets/test_mind.py:128:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
recommenders/datasets/mind.py:66: in download_mind
train_path = maybe_download(url=url_train, work_directory=path)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:56: in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:266: in call
raise attempt.get()
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:301: in get
six.reraise(self.value[0], self.value[1], self.value[2])
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/six.py:719: in reraise
raise value
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:251: in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
recommenders/datasets/download_utils.py:52: in maybe_download
r.raise_for_status()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Response [409]>
def raise_for_status(self):
"""Raises :class:`HTTPError`, if one occurred."""
http_error_msg = ""
if isinstance(self.reason, bytes):
# We attempt to decode utf-8 first because some servers
# choose to localize their reason strings. If the string
# isn't utf-8, we fall back to iso-8859-1 for all other
# encodings. (See PR #3538)
try:
reason = self.reason.decode("utf-8")
except UnicodeDecodeError:
reason = self.reason.decode("iso-8859-1")
else:
reason = self.reason
if 400 <= self.status_code < 500:
http_error_msg = (
f"{self.status_code} Client Error: {reason} for url: {self.url}"
)
elif 500 <= self.status_code < 600:
http_error_msg = (
f"{self.status_code} Server Error: {reason} for url: {self.url}"
)
if http_error_msg:
> raise HTTPError(http_error_msg, response=self)
E requests.exceptions.HTTPError: 409 Client Error: Public access is not permitted on this storage account. for url: https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/models.py:1024: HTTPError
------------------------------ Captured log call -------------------------------
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
___________________________ test_extract_mind_large ____________________________
tmp = '/tmp/pytest-of-root/pytest-0/tmpl09ornq9'
def test_extract_mind_large(tmp):
> train_zip, valid_zip = download_mind(size="large", dest_path=tmp)
tests/data_validation/recommenders/datasets/test_mind.py:136:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
recommenders/datasets/mind.py:66: in download_mind
train_path = maybe_download(url=url_train, work_directory=path)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:56: in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:266: in call
raise attempt.get()
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:301: in get
six.reraise(self.value[0], self.value[1], self.value[2])
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/six.py:719: in reraise
raise value
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:251: in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
recommenders/datasets/download_utils.py:52: in maybe_download
r.raise_for_status()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Response [409]>
def raise_for_status(self):
"""Raises :class:`HTTPError`, if one occurred."""
http_error_msg = ""
if isinstance(self.reason, bytes):
# We attempt to decode utf-8 first because some servers
# choose to localize their reason strings. If the string
# isn't utf-8, we fall back to iso-8859-1 for all other
# encodings. (See PR #3538)
try:
reason = self.reason.decode("utf-8")
except UnicodeDecodeError:
reason = self.reason.decode("iso-8859-1")
else:
reason = self.reason
if 400 <= self.status_code < 500:
http_error_msg = (
f"{self.status_code} Client Error: {reason} for url: {self.url}"
)
elif 500 <= self.status_code < 600:
http_error_msg = (
f"{self.status_code} Server Error: {reason} for url: {self.url}"
)
if http_error_msg:
> raise HTTPError(http_error_msg, response=self)
E requests.exceptions.HTTPError: 409 Client Error: Public access is not permitted on this storage account. for url: https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/models.py:1024: HTTPError
------------------------------ Captured log call -------------------------------
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
ERROR recommenders.datasets.download_utils:download_utils.py:51 Problem downloading https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
_____________________________ test_mind_utils_runs _____________________________
notebooks = {'als_deep_dive': '/mnt/azureml/cr/j/cbe7d84286194babb3a4e8f83d1aff91_2/exe/wd/examples/02_model_collaborative_filteri..._movielens': '/mnt/azureml/cr/j/cbe7d84286194babb3a4e8f83d1aff91_2/exe/wd/examples/06_benchmarks/movielens.ipynb', ...}
output_notebook = 'output.ipynb', kernel_name = 'python3'
tmp = '/tmp/pytest-of-root/pytest-0/tmp9ogydimj'
def test_mind_utils_runs(notebooks, output_notebook, kernel_name, tmp):
notebook_path = notebooks["mind_utils"]
> execute_notebook(
notebook_path,
output_notebook,
kernel_name=kernel_name,
parameters=dict(mind_type="small", word_embedding_dim=300),
)
tests/data_validation/examples/test_mind.py:9:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
recommenders/utils/notebook_utils.py:102: in execute_notebook
executed_notebook, _ = execute_preprocessor.preprocess(
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/nbconvert/preprocessors/execute.py:103: in preprocess
self.preprocess_cell(cell, resources, index)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/nbconvert/preprocessors/execute.py:124: in preprocess_cell
cell = self.execute_cell(cell, index, store_history=True)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/jupyter_core/utils/__init__.py:165: in wrapped
return loop.run_until_complete(inner)
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/asyncio/base_events.py:647: in run_until_complete
return future.result()
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/nbclient/client.py:1062: in async_execute_cell
await self._check_raise_for_error(cell, cell_index, exec_reply)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <nbconvert.preprocessors.execute.ExecutePreprocessor object at 0x14d0d2d39b80>
cell = {'cell_type': 'code', 'execution_count': 3, 'metadata': {'execution': {'iopub.status.busy': '2024-07-31T07:20:32.96568...lid'), clean_zip_file=False)\noutput_path = os.path.join(data_path, 'utils')\nos.makedirs(output_path, exist_ok=True)"}
cell_index = 4
exec_reply = {'buffers': [], 'content': {'ename': 'HTTPError', 'engine_info': {'engine_id': -1, 'engine_uuid': '8e3729aa-d321-4d83-...e, 'engine': '8e3729aa-d321-4d83-9075-37f91225a783', 'started': '2024-07-31T07:20:32.966079Z', 'status': 'error'}, ...}
async def _check_raise_for_error(
self, cell: NotebookNode, cell_index: int, exec_reply: dict[str, t.Any] | None
) -> None:
if exec_reply is None:
return None
exec_reply_content = exec_reply["content"]
if exec_reply_content["status"] != "error":
return None
cell_allows_errors = (not self.force_raise_errors) and (
self.allow_errors
or exec_reply_content.get("ename") in self.allow_error_names
or "raises-exception" in cell.metadata.get("tags", [])
)
await run_hook(
self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
)
if not cell_allows_errors:
> raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E tmpdir = TemporaryDirectory()
E data_path = tmpdir.name
E train_zip, valid_zip = download_mind(size=mind_type, dest_path=data_path)
E unzip_file(train_zip, os.path.join(data_path, 'train'), clean_zip_file=False)
E unzip_file(valid_zip, os.path.join(data_path, 'valid'), clean_zip_file=False)
E output_path = os.path.join(data_path, 'utils')
E os.makedirs(output_path, exist_ok=True)
E ------------------
E
E ----- stderr -----
E Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
E ----- stderr -----
E Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
E ----- stderr -----
E Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
E ----- stderr -----
E Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
E ----- stderr -----
E Problem downloading https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
E ------------------
E
E ---------------------------------------------------------------------------
E HTTPError Traceback (most recent call last)
E Cell In[3], line 3
E 1 tmpdir = TemporaryDirectory()
E 2 data_path = tmpdir.name
E ----> 3 train_zip, valid_zip = download_mind(size=mind_type, dest_path=data_path)
E 4 unzip_file(train_zip, os.path.join(data_path, 'train'), clean_zip_file=False)
E 5 unzip_file(valid_zip, os.path.join(data_path, 'valid'), clean_zip_file=False)
E
E File /mnt/azureml/cr/j/cbe7d84286194babb3a4e8f83d1aff91_2/exe/wd/recommenders/datasets/mind.py:66, in download_mind(size, dest_path)
E 64 url_train, url_valid = URL_MIND[size]
E 65 with download_path(dest_path) as path:
E ---> 66 train_path = maybe_download(url=url_train, work_directory=path)
E 67 valid_path = maybe_download(url=url_valid, work_directory=path)
E 68 return train_path, valid_path
E
E File /azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:56, in retry.<locals>.wrap.<locals>.wrapped_f(*args, **kw)
E 54 @six.wraps(f)
E 55 def wrapped_f(*args, **kw):
E ---> 56 return Retrying(*dargs, **dkw).call(f, *args, **kw)
E
E File /azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:266, in Retrying.call(self, fn, *args, **kwargs)
E 263 if self.stop(attempt_number, delay_since_first_attempt_ms):
E 264 if not self._wrap_exception and attempt.has_exception:
E 265 # get() on an attempt with an exception should cause it to be raised, but raise just in case
E --> 266 raise attempt.get()
E 267 else:
E 268 raise RetryError(attempt)
E
E File /azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:301, in Attempt.get(self, wrap_exception)
E 299 raise RetryError(self)
E 300 else:
E --> 301 six.reraise(self.value[0], self.value[1], self.value[2])
E 302 else:
E 303 return self.value
E
E File /azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/six.py:719, in reraise(tp, value, tb)
E 717 if value.__traceback__ is not tb:
E 718 raise value.with_traceback(tb)
E --> 719 raise value
E 720 finally:
E 721 value = None
E
E File /azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/retrying.py:251, in Retrying.call(self, fn, *args, **kwargs)
E 248 self._before_attempts(attempt_number)
E 250 try:
E --> 251 attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
E 252 except:
E 253 tb = sys.exc_info()
E
E File /mnt/azureml/cr/j/cbe7d84286194babb3a4e8f83d1aff91_2/exe/wd/recommenders/datasets/download_utils.py:52, in maybe_download(url, filename, work_directory, expected_bytes)
E 50 else:
E 51 log.error(f"Problem downloading {url}")
E ---> 52 r.raise_for_status()
E 53 else:
E 54 log.info(f"File {filepath} already downloaded")
E
E File /azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/requests/models.py:1024, in Response.raise_for_status(self)
E 1019 http_error_msg = (
E 1020 f"{self.status_code} Server Error: {reason} for url: {self.url}"
E 1021 )
E 1023 if http_error_msg:
E -> 1024 raise HTTPError(http_error_msg, response=self)
E
E HTTPError: 409 Client Error: Public access is not permitted on this storage account. for url: https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/nbclient/client.py:918: CellExecutionError
=============================== warnings summary ===============================
../../../../../../../azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/jupyter_client/connect.py:22
/azureml-envs/azureml_78455e6ba0500ddf6d4749f1a3e8d1f3/lib/python3.9/site-packages/jupyter_client/connect.py:22: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
given by the platformdirs library. To remove this warning and
see the appropriate new directories, set the environment variable
`JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
The use of platformdirs will be the default in `jupyter_core` v6
from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
233.54s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[20m-20000263-27278-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
165.79s call tests/functional/examples/test_notebooks_python.py::test_cornac_bpr_functional[1m-expected_values0]
115.75s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[10m-10000054-10681-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
42.27s call tests/data_validation/examples/test_mind.py::test_mind_utils_values
25.94s call tests/smoke/examples/test_notebooks_python.py::test_lightgbm_quickstart_smoke
18.44s call tests/data_validation/examples/test_wikidata.py::test_wikidata_runs
15.79s call tests/data_validation/examples/test_mind.py::test_mind_utils_runs
14.12s call tests/smoke/examples/test_notebooks_python.py::test_cornac_bpr_smoke
13.74s call tests/data_validation/recommenders/datasets/test_mind.py::test_download_mind_small
12.15s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[1m-1000209-3883-1-Toy Story (1995)-Animation|Children's|Comedy-1995]
11.75s call tests/data_validation/recommenders/datasets/test_mind.py::test_download_mind_large
11.64s call tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_small
9.98s call tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_large
7.90s call tests/data_validation/examples/test_wikidata.py::test_wikidata_values
5.15s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[20m-27278-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
4.99s call tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[20m]
2.38s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[10m-10681-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
2.25s call tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[10m]
2.11s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[100k-100000-1682-1-Toy Story (1995)-Animation|Children's|Comedy-1995]
0.99s call tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_demo
0.71s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[100k-1682-1-Toy Story (1995)-Animation|Children's|Comedy-1995]
0.68s call tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[1m]
0.56s call tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[100k]
0.54s call tests/data_validation/recommenders/datasets/test_mind.py::test_download_mind_demo
0.54s call tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[1m-3883-1-Toy Story (1995)-Animation|Children's|Comedy-1995]
0.32s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip-103456245-0x8D8244E92005849]
0.32s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip-155178106-0x8D87F67F4AEB960]
0.31s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip-30945572-0x8D834F2EBA8D865]
0.31s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip-52952752-0x8D834F2EB31BDEC]
0.29s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip-530196631-0x8D8244E90C15C07]
0.29s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip-150359301-0x8D87F67E6CA4364]
0.09s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[20m-27278-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
0.09s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[20m]
0.08s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[20m-20000263-27278-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
0.07s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_dev.zip-10080022-"0x8D8B8AD5B188839"]
0.05s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_train.zip-17372879-"0x8D8B8AD5B233930"]
0.05s call tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://recodatasets.z20.web.core.windows.net/newsrec/MINDdemo_utils.zip-97292694-"0x8D8B8AD5B126C3B"]
0.04s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_download_and_extract_movielens[10m]
0.04s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_load_item_df[10m-10681-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
0.04s teardown tests/data_validation/recommenders/datasets/test_movielens.py::test_load_pandas_df[10m-10000054-10681-1-Toy Story (1995)-Adventure|Animation|Children|Comedy|Fantasy-1995]
(62 durations < 0.005s hidden. Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip-52952752-0x8D834F2EB31BDEC]
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip-30945572-0x8D834F2EBA8D865]
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDsmall_utils.zip-155178106-0x8D87F67F4AEB960]
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip-530196631-0x8D8244E90C15C07]
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip-103456245-0x8D8244E92005849]
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_mind_url[https://mind201910small.blob.core.windows.net/release/MINDlarge_utils.zip-150359301-0x8D87F67E6CA4364]
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_download_mind_small
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_small
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_download_mind_large
FAILED tests/data_validation/recommenders/datasets/test_mind.py::test_extract_mind_large
FAILED tests/data_validation/examples/test_mind.py::test_mind_utils_runs - nb...
============= 11 failed, 23 passed, 1 warning in 724.17s (0:12:04) =============
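All 11 failures in the summary above involve URLs on mind201910small.blob.core.windows.net, and the tracebacks shown end in the same 409 "Public access is not permitted on this storage account" response, which the download was retried five times against. As a rough sketch for checking one of these URLs outside the test suite, something like the following could be used. It relies only on the standard library; `probe` and `is_retryable_status` are hypothetical helper names for illustration and are not part of recommenders, and treating only 429 and 5xx as retryable is an assumption, not a rule taken from the project.

```python
import urllib.error
import urllib.request


def is_retryable_status(status_code: int) -> bool:
    """Return True only for statuses where a retry might plausibly succeed.

    Assumption for this sketch: 429 and 5xx are often transient, while other
    4xx responses (such as the 409 in the log above) will not change on retry.
    """
    return status_code == 429 or 500 <= status_code < 600


def probe(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code without raising on 4xx/5xx."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still available on the exception.
        return err.code
```

Calling `probe("https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip")` and passing the result to `is_retryable_status` would show whether retrying is worthwhile. For a 409 it is not, so the fix has to come from the storage account settings or a different download location.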
This has been fixed.
Description
The VMs for the tests are not even starting:
In which platform does it happen?
How do we replicate the issue?
See example: https://github.com/recommenders-team/recommenders/actions/runs/10406895552/job/28821110978
Expected behavior (i.e. solution)
Willingness to contribute
Other Comments
FYI @SimonYansenZhao