cdisc-org / cdisc-rules-engine

Open source offering of the cdisc rules engine
MIT License
45 stars 12 forks source link

Rule Engine putting info logs even when the logs are disabled #731

Open RamilCDISC opened 2 weeks ago

RamilCDISC commented 2 weeks ago

While running the valdiaiton command in the terminal, if the log level is not explicitly mentioned then it is set as disabled. But it is noticed that the engine still puts INFO logs even though the logs are disabled.

To Reproduce run the following command in a terminal to reproduce the issue:

python core.py validate -s sdtmig -v 3.4 -dp tests/resources/datasets/ae.xpt

The engine will output INFO logs. Following is a small part of the logs outputted by the command, for reference

[INFO 2024-06-19 14:38:41,898 - console_logger.py:41] - Downloading dataset from storage. dataset_name=tests/resources/datasets/ae.xpt, wrapped function=get_variables_metadata [INFO 2024-06-19 14:38:41,899 - console_logger.py:41] - Dataset "tests/resources/datasets/ae.xpt" was found in cache. cache_key=tests/resources/datasets/ae.xpt_variables_metadata [INFO 2024-06-19 14:38:41,899 - console_logger.py:41] - Downloaded dataset. dataset=<cdisc_rules_engine.models.dataset.pandas_dataset.PandasDataset object at 0x11f9c2b30> [INFO 2024-06-19 14:38:41,899 - console_logger.py:41] - is_suitable_for_validation. rule id=CORE-000162, domain=AE, result=False [INFO 2024-06-19 14:38:41,899 - console_logger.py:41] - Skipped domain AE. [███████████████████████████████████-] 99%[INFO 2024-06-19 14:38:41,909 - console_logger.py:41] - Using PANDAS dataset implementation [INFO 2024-06-19 14:38:41,909 - console_logger.py:41] - Using PANDAS dataset implementation [INFO 2024-06-19 14:38:41,909 - console_logger.py:41] - Validating domain AE. rule={'core_id': 'CORE-000059', 'author': 'CDISC', 'reference': [[{'Citations': [{'Cited_Guidance': 'RPPLDY is not be used in human clinical trials.', 'Document': 'SDTM v2.0', 'Section': 'Timing'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0632', 'Version': '1'}, 'Version': '2.0'}]], 'sensitivity': 'Dataset', 'executability': 'fully executable', 'description': 'Raise an error when RPPLDY is present.', 'authorities': [{'Organization': 'CDISC', 'Standards': [{'Name': 'SDTMIG', 'References': [{'Citations': [{'Cited_Guidance': 'RPPLDY is not be used in human clinical trials.', 'Document': 'SDTM v2.0', 'Section': 'Timing'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0632', 'Version': '1'}, 'Version': '2.0'}], 'Version': '3.4'}]}], 'standards': [{'Name': 'SDTMIG', 'Version': '3.4'}], 'classes': {'Include': ['ALL']}, 'domains': {'Include': ['ALL']}, 'rule_type': 'Record Data', 'conditions': <cdisc_rules_engine.models.rule_conditions.condition_composite.ConditionComposite object at 0x117a22a10>, 'actions': [{'name': 'generate_dataset_error_objects', 'params': {'message': 'RPPLDY is present.'}}]}. dataset_path=tests/resources/datasets/ae.xpt. datasets=[{'domain': 'AE', 'filename': 'ae.xpt', 'full_path': 'tests/resources/datasets/ae.xpt', 'length': 74, 'label': 'Adverse Events', 'size': 38080, 'modification_date': '2020-08-21T09:14:28', 'temp_filename': None}]. [INFO 2024-06-19 14:38:41,909 - console_logger.py:41] - is_suitable_for_validation. rule id=CORE-000059, domain=AE, result=True [INFO 2024-06-19 14:38:41,909 - console_logger.py:41] - Downloading dataset from storage. dataset_name=tests/resources/datasets/ae.xpt, wrapped function=get_dataset [INFO 2024-06-19 14:38:41,910 - console_logger.py:41] - Dataset "tests/resources/datasets/ae.xpt" was found in cache. cache_key=tests/resources/datasets/ae.xpt_contents [INFO 2024-06-19 14:38:41,910 - console_logger.py:41] - Downloaded dataset. dataset=<cdisc_rules_engine.models.dataset.pandas_dataset.PandasDataset object at 0x1272c1f30> [INFO 2024-06-19 14:38:41,910 - console_logger.py:41] - Using dataset build by: <class 'cdisc_rules_engine.dataset_builders.contents_dataset_builder.ContentsDatasetBuilder'> [INFO 2024-06-19 14:38:41,910 - console_logger.py:41] - is_relationship_dataset. domain=AE, result=False [INFO 2024-06-19 14:38:41,911 - console_logger.py:41] - Validated domain AE. Result = [] [INFO 2024-06-19 14:38:41,920 - console_logger.py:41] - Using PANDAS dataset implementation [INFO 2024-06-19 14:38:41,920 - console_logger.py:41] - Using PANDAS dataset implementation [INFO 2024-06-19 14:38:41,920 - console_logger.py:41] - Validating domain AE. rule={'core_id': 'CORE-000180', 'author': 'CDISC', 'reference': [[{'Citations': [{'Cited_Guidance': 'Each domain dataset is distinguished by a unique, two-character code', 'Document': 'IG v3.4', 'Section': '2.2'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0308', 'Version': '1'}, 'Version': '2.0'}], [{'Citations': [{'Cited_Guidance': 'Each domain dataset is distinguished by a unique, two-character code', 'Document': 'IG v3.2', 'Section': '2.2'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0308', 'Version': '1'}, 'Version': '2.0'}], [{'Citations': [{'Cited_Guidance': 'Each domain dataset is distinguished by a unique, two-character code', 'Document': 'IG v3.3', 'Section': '2.2'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0308', 'Version': '1'}, 'Version': '2.0'}]], 'sensitivity': 'Domain', 'executability': 'fully executable', 'description': 'Raise an error when DOMAIN value length is not equal 2.', 'authorities': [{'Organization': 'CDISC', 'Standards': [{'Name': 'SDTMIG', 'References': [{'Citations': [{'Cited_Guidance': 'Each domain dataset is distinguished by a unique, two-character code', 'Document': 'IG v3.4', 'Section': '2.2'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0308', 'Version': '1'}, 'Version': '2.0'}], 'Version': '3.4'}, {'Name': 'SDTMIG', 'References': [{'Citations': [{'Cited_Guidance': 'Each domain dataset is distinguished by a unique, two-character code', 'Document': 'IG v3.2', 'Section': '2.2'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0308', 'Version': '1'}, 'Version': '2.0'}], 'Version': '3.2'}, {'Name': 'SDTMIG', 'References': [{'Citations': [{'Cited_Guidance': 'Each domain dataset is distinguished by a unique, two-character code', 'Document': 'IG v3.3', 'Section': '2.2'}], 'Origin': 'SDTM and SDTMIG Conformance Rules', 'Rule_Identifier': {'Id': 'CG0308', 'Version': '1'}, 'Version': '2.0'}], 'Version': '3.3'}]}], 'standards': [{'Name': 'SDTMIG', 'Version': '3.4'}, {'Name': 'SDTMIG', 'Version': '3.2'}, {'Name': 'SDTMIG', 'Version': '3.3'}], 'classes': {'Exclude': ['RELATIONSHIP']}, 'domains': {'Exclude': ['AP--']}, 'rule_type': 'Record Data', 'conditions': <cdisc_rules_engine.models.rule_conditions.condition_composite.ConditionComposite object at 0x10055ab00>, 'actions': [{'name': 'generate_dataset_error_objects', 'params': {'message': 'Domain value length is not equal 2.'}}]}. dataset_path=tests/resources/datasets/ae.xpt. datasets=[{'domain': 'AE', 'filename': 'ae.xpt', 'full_path': 'tests/resources/datasets/ae.xpt', 'length': 74, 'label': 'Adverse Events', 'size': 38080, 'modification_date': '2020-08-21T09:14:28', 'temp_filename': None}]. [INFO 2024-06-19 14:38:41,921 - console_logger.py:41] - Downloading dataset from storage. dataset_name=tests/resources/datasets/ae.xpt, wrapped function=get_variables_metadata [INFO 2024-06-19 14:38:41,921 - console_logger.py:41] - Dataset "tests/resources/datasets/ae.xpt" was found in cache. cache_key=tests/resources/datasets/ae.xpt_variables_metadata [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - Downloaded dataset. dataset=<cdisc_rules_engine.models.dataset.pandas_dataset.PandasDataset object at 0x127ac21a0> [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - is_suitable_for_validation. rule id=CORE-000180, domain=AE, result=True [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - Downloading dataset from storage. dataset_name=tests/resources/datasets/ae.xpt, wrapped function=get_dataset [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - Dataset "tests/resources/datasets/ae.xpt" was found in cache. cache_key=tests/resources/datasets/ae.xpt_contents [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - Downloaded dataset. dataset=<cdisc_rules_engine.models.dataset.pandas_dataset.PandasDataset object at 0x127ac2b90> [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - Using dataset build by: <class 'cdisc_rules_engine.dataset_builders.contents_dataset_builder.ContentsDatasetBuilder'> [INFO 2024-06-19 14:38:41,922 - console_logger.py:41] - is_relationship_dataset. domain=AE, result=False [INFO 2024-06-19 14:38:41,923 - console_logger.py:41] - Validated domain AE. Result = [] [████████████████████████████████████] 100%

SFJohnson24 commented 2 weeks ago

The issue is related to run_validation where it gets into the multithreaded Pool--the logger is a global variable and does not move into these execution contexts. We need to feed the logger into the pool with the prior config to continue the logging at the chosen setting for validations.