open-telemetry / opentelemetry-cpp

The OpenTelemetry C++ Client
https://opentelemetry.io/
Apache License 2.0

Source Files contain Unicode Text #2706

Closed: perhapsmaple closed this issue 1 week ago

perhapsmaple commented 1 week ago

Describe your environment

Branch: main (commit 25738f391bf4f7a583690d4a5c827d3803d2b6c3)

Steps to reproduce

find . -type f -name "*.h" -exec file {} \; | grep UTF-8
find . -type f -name "*.cc" -exec file {} \; | grep UTF-8
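For systems where `file` is unavailable or reports encodings differently, the same check can be approximated by reading raw bytes. The following is a minimal sketch, not part of the repository's tooling: the `classify` and `scan` helpers are hypothetical names introduced here, and the `.h`/`.cc` suffix list simply mirrors the two `find` commands above.

```python
import codecs
from pathlib import Path


def classify(data: bytes) -> str:
    """Return 'bom', 'non-ascii', or 'ascii' for the raw bytes of a file."""
    # A UTF-8 BOM is the three-byte sequence EF BB BF at the very start.
    if data.startswith(codecs.BOM_UTF8):
        return "bom"
    # Any byte above 0x7F means the content is not plain ASCII.
    if not data.isascii():
        return "non-ascii"
    return "ascii"


def scan(root: str, suffixes=(".h", ".cc")):
    """Yield (path, kind) for every matching file that is not pure ASCII."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            kind = classify(path.read_bytes())
            if kind != "ascii":
                yield path, kind


if __name__ == "__main__":
    for path, kind in scan("."):
        print(f"{path}: {kind}")
```

Unlike the `grep UTF-8` filter, this separates files that only carry a BOM from files that contain non-ASCII characters elsewhere, which call for different fixes.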

What is the expected behavior?

Source files are expected to be ASCII encoded, except where Unicode characters are required for tests.

What is the actual behavior?

harish@Harishs-MacBook-Air opentelemetry-cpp % find . -type f -name "*.h" -exec file {} \; | grep UTF-8  
./api/include/opentelemetry/context/context.h: C++ source text, Unicode text, UTF-8 (with BOM) text
./api/include/opentelemetry/context/runtime_context.h: C++ source text, Unicode text, UTF-8 (with BOM) text
./api/include/opentelemetry/baggage/baggage.h: C++ source text, Unicode text, UTF-8 (with BOM) text

harish@Harishs-MacBook-Air opentelemetry-cpp % find . -type f -name "*.cc" -exec file {} \; | grep UTF-8
./ext/test/http/url_parser_test.cc: c program text, Unicode text, UTF-8 text
./sdk/test/metrics/instrument_metadata_validator_test.cc: c program text, Unicode text, UTF-8 text
./opentracing-shim/src/span_shim.cc: C++ source text, Unicode text, UTF-8 text

Additional context

The headers listed above all have a BOM character at the start of the file, and span_shim.cc has a Unicode character in a comment. I currently use an in-house build system built on flex, which has trouble parsing UTF-8 encoded files. I think we should convert all source files to ASCII encoding unless Unicode is required. I would be happy to contribute a PR if needed.
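As a rough illustration of what such a cleanup could involve, the sketch below strips a leading UTF-8 BOM and lists the remaining non-ASCII characters so they can be reviewed by hand. It is only a sketch under stated assumptions: `strip_bom` and `non_ascii_positions` are hypothetical helper names, the input is assumed to be valid UTF-8, and nothing here is taken from the actual fix.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; other bytes are untouched."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def non_ascii_positions(data: bytes):
    """Return (line, column, character) for each non-ASCII character.

    Line and column numbers are 1-based. Assumes data is valid UTF-8.
    """
    text = data.decode("utf-8")
    hits = []
    # Split on "\n" only, so line numbers match what an editor shows.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((lineno, col, ch))
    return hits


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "int x;\n// caf\u00e9\n".encode("utf-8")
    cleaned = strip_bom(sample)
    for lineno, col, ch in non_ascii_positions(cleaned):
        print(f"line {lineno}, column {col}: U+{ord(ch):04X}")
```

Removing the BOM is mechanical, while a non-ASCII character in a comment needs a human decision on its ASCII replacement, and test files that need Unicode input should be left alone.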

marcalff commented 1 week ago

Thanks for the report. PR welcome.