aws / aws-xray-sdk-python

AWS X-Ray SDK for the Python programming language
Apache License 2.0
329 stars 143 forks source link

No way to avoid loading the sampling_rule.json file #108

Open christianbeland opened 6 years ago

christianbeland commented 6 years ago

Would it be possible to only load this file if the LocalSampler is initialized?

Moving

with open(resource_filename(__name__, 'sampling_rule.json')) as f: local_sampling_rule = json.load(f) to def __init__(self, rules=local_sampling_rule):

Originally posted by @christianbeland in https://github.com/aws/aws-xray-sdk-python/issues/105#issuecomment-428700941

haotianw465 commented 6 years ago

This default local sampling rules definition prevents user application from oversampling when there is an issue of getting sampling rules from X-Ray service. But a file might not be necessary here. A global constants might be more convenient instead of a data file. This will be a breaking change for users who directly modify the file for customized fallback rules so we cannot make such change until we have a new major version release. What is the issue you are having with loading this file by default?

christianbeland commented 6 years ago

I currently have a few shared libraries that include some x-ray instrumentation. These libraries are used by multiple projects, some built using pyinstaller. This would requires to modify each of these project to make sure the sampling_rule.json file is collected (as described https://github.com/aws/aws-xray-sdk-python/issues/105).

I do not find this solution appropriate since it would require them to have a specific mention of the aws-xray-python-sdk in their project (currently only a transitive dependency).

For now, the only solution I found was to dynamically load xray_recorder, XRaySessionMaker, LambdaContext and LocalSampler using importlib. It allows to avoid automatically trying to read the file if x-ray is not explicitly activated for a specific project. But relying on importlib for this is not ideal.

haotianw465 commented 6 years ago

Got it. The extra data files indeed are not optimal in case of custom packaging. A non-breaking workaround is to have the recorder fallback to a sampling_rule.py module that contains a global dictionary of default sampling rules definition when it cannot load the .json data file. But a similar change also needs to apply to https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/resources/aws_para_whitelist.json. We will put this to our backlog. A PR is always welcome