open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0

Fix class BoundedAttributes to have RLock rather than Lock #3859

Closed: hyoinandout closed this 1 month ago

hyoinandout commented 2 months ago

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

This change makes the class BoundedAttributes use an RLock rather than a Lock. Fixes #3858 (issue). The motivation for this PR is that a deadlock was observed while using the OpenTelemetry _logs API. No additional dependencies are required for this change.
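To illustrate the general difference between the two lock types, here is a minimal sketch using only the standard library. It does not reproduce the call path from #3858; the helper `can_reacquire` is a hypothetical name introduced for illustration. It shows that a thread already holding a `threading.Lock` cannot acquire it again, while the same attempt on a `threading.RLock` succeeds.

```python
import threading


def can_reacquire(lock) -> bool:
    """Return True if the thread already holding `lock` can acquire it again.

    `can_reacquire` is a hypothetical helper, not part of OpenTelemetry.
    """
    with lock:
        # Non-blocking acquire, so a plain Lock reports failure
        # instead of blocking forever (which would be the deadlock).
        acquired_again = lock.acquire(blocking=False)
        if acquired_again:
            lock.release()
        return acquired_again


lock_result = can_reacquire(threading.Lock())
rlock_result = can_reacquire(threading.RLock())
print(lock_result, rlock_result)  # False True
```

With a blocking acquire, the `Lock` case would hang in the same thread, which is the kind of symptom a re-entrant lock avoids.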

Type of change

Please delete options that are not relevant.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

It seems that a dedicated test case is required.

Does This PR Require a Contrib Repo Change?

Answer the following question based on these examples of changes that would require a Contrib Repo Change:

Checklist:

linux-foundation-easycla[bot] commented 2 months ago

CLA Signed

The committers listed above are authorized under a signed CLA.

xrmx commented 2 months ago

@hyoinandout please rebase and add a changelog entry so that the tests go green.

hyoinandout commented 2 months ago

@xrmx Thank you for the heads-up. I rebased my branch on top of the main branch and added a changelog entry for this PR.

xrmx commented 2 months ago

@hyoinandout The test you added is failing on a specific combination.

hyoinandout commented 2 months ago

@xrmx I will keep an eye on it, but since I am not an expert in this build environment, I'm not sure I will find the reason. Could you share your opinion on why the test is failing in that specific environment?

emdneto commented 1 month ago

From the pipelines, the test is failing on the Windows 2019 build check. From the logs, it seems that Thread 2 reads the value from bdict before Thread 1 modifies it, and then Thread 2 modifies it before Thread 1 writes its doubled value. For this reason, the final value might not be the expected 4x. The behavior on Windows seems unstable.
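The interleaving described above can be replayed by hand, without real threads, to see why the result falls short of 4x. This is only a sketch: the plain dict, the key `"k"`, and the starting value 1 are assumptions for illustration, not taken from the actual test.

```python
# Plain dict standing in for bdict; key and starting value are assumed.
bdict = {"k": 1}

t1_read = bdict["k"]      # Thread 1 reads 1
t2_read = bdict["k"]      # Thread 2 reads 1, before Thread 1 has written
bdict["k"] = t2_read * 2  # Thread 2 writes its doubled value: 2
bdict["k"] = t1_read * 2  # Thread 1 writes its doubled value: 2 (lost update)

final_value = bdict["k"]
print(final_value)  # 2, not the expected 4
```

Each write is individually safe, but the read and the write are separate steps, so one doubling is lost.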

ocelotl commented 1 month ago

@lzchen I think we should probably skip this test case for the failing Windows environment, what do you think?

xrmx commented 1 month ago

The test is racy, though: the lock is taken in __setitem__ and does not cover reading the current value in __getitem__. So maybe just set a fixed value and assert that all elements have been set. We are only testing that the deadlock that used to occur is now gone, so that should be fine.
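A test along those lines might look like the sketch below. It is not the test from this PR: `LockedDict` is a hypothetical stand-in for BoundedAttributes so the example runs with the standard library alone, and the thread count, key names, and timeout are arbitrary choices.

```python
import threading


class LockedDict:
    """Hypothetical stand-in: a mapping that guards writes with an RLock."""

    def __init__(self):
        self._lock = threading.RLock()
        self._data = {}

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __len__(self):
        return len(self._data)


def set_fixed_value(target, keys, value):
    # Writes never depend on a previously read value, so there is
    # no read-modify-write race to make the outcome flaky.
    for key in keys:
        target[key] = value


bdict = LockedDict()
keys = [f"key{i}" for i in range(100)]
threads = [
    threading.Thread(
        target=set_fixed_value, args=(bdict, keys, "fixed"), daemon=True
    )
    for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    # A timeout turns a would-be deadlock into a failed check, not a hang.
    thread.join(timeout=5)

all_finished = not any(thread.is_alive() for thread in threads)
all_set = all(bdict[key] == "fixed" for key in keys)
```

Asserting `all_finished` and `all_set` checks only that every writer completed and every key holds the fixed value, which does not depend on thread scheduling.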

hyoinandout commented 1 month ago

@xrmx Now I fully get the point. I really appreciate your explanation, and I have just pushed a commit that tests it the way you described.

hyoinandout commented 1 month ago

See the comment for the deadlock which motivated this PR.