Open triatic opened 3 days ago
The CLI does not output non-Unicode characters. If you have an example, please share it and we will investigate.
ASCII encoding from oci compute instance list. Note: this JSON output contained non-ASCII characters which were not Unicode.
C:\>php -r "var_dump(mb_detect_encoding(shell_exec('oci compute instance list --compartment-id ocid1.tenancy.oc1..removed')));"
string(5) "ASCII"
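The check above reports only a single guessed encoding name, so it does not show which bytes are the problem. As a minimal sketch, the raw output could be scanned for bytes outside the 7-bit ASCII range. The helper name find_non_ascii and the sample bytes are hypothetical; the sample simulates the reported field encoded as Windows-1252 rather than real CLI output.

```python
from typing import List, Tuple


def find_non_ascii(raw: bytes) -> List[Tuple[int, str]]:
    """Return (offset, hex value) for every byte outside 7-bit ASCII."""
    return [(i, f"0x{b:02X}") for i, b in enumerate(raw) if b >= 0x80]


# Simulated fragment of CLI output (assumption: Windows-1252 encoded).
sample = '"Ampere® Altra™"'.encode("cp1252")
print(find_non_ascii(sample))
```

An empty list would mean the output really is pure ASCII; any entries give the exact offsets to include in a bug report.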
Can you please share the output of the oci command, or at least the start of it, where it shows data? Are there any errors or warnings?
{
"data": {
"items": [
{
Which Python version do you use?
I'm using the newest Windows oci msi package downloaded from GitHub, which bundles Python. The JSON is formatted correctly, other than the non-Unicode characters.
The line that breaks things is this:
"processor-description": "3.0 GHz Ampere® Altra™",
Start of output:
C:\>oci compute instance list --compartment-id ocid1.tenancy.oc1..removed
{
"data": [
{
"agent-config": {
... etc
I asked Python version :) I tried to run and didn't see any non ascii, I will wait for OCI CLI team to respond
I asked Python version :)
Whatever the MSI package installs? I can see python38.dll in the installation directory, and I do not have Python globally installed in Windows.
Thank you for that
I tried to run and didn't see any non ascii
"3.0 GHz Ampere® Altra™" contains non ASCII characters, the ® and ™ characters. The problem for me is that they are also not produced in unicode by oci as required by json spec.
Understood, it is the processor type. Nupur, please take it up with the OCI CLI team: "processor-description": "3.0 GHz Ampere® Altra™"
@adizohar just to clarify, are you saying only ASCII characters should be returned by oci's JSON output, and the expected fix is to remove ® and ™ from the JSON output?
No, I don't believe this is a bug or an issue that needs to be fixed. I have asked the OCI CLI team to take a look. In the meantime, you can filter out the non-ASCII characters before ingesting the JSON, or use the OCI Python SDK to read and handle these characters.
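The filtering workaround suggested here could look roughly like the following sketch. It drops every byte above 0x7F before parsing, so it is lossy: the ® and ™ symbols disappear, and any legitimate multi-byte UTF-8 text would be removed as well. The helper name strip_non_ascii and the sample bytes are hypothetical and simulate the reported field encoded as Windows-1252.

```python
import json


def strip_non_ascii(raw: bytes) -> bytes:
    """Drop every byte outside the 7-bit ASCII range (lossy)."""
    return bytes(b for b in raw if b < 0x80)


# Simulated CLI output (assumption: Windows-1252 encoded).
sample = '{"processor-description": "3.0 GHz Ampere® Altra™"}'.encode("cp1252")
cleaned = strip_non_ascii(sample)
print(json.loads(cleaned)["processor-description"])
```

With this sample the parsed value becomes "3.0 GHz Ampere Altra", which shows what is lost by filtering instead of converting.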
Ok. At the moment I am converting oci's output from ASCII to UTF-8 where the ® and ™ characters are present, which prevents json_decode() from failing.
According to https://thesmsworks.co.uk/unicode-detector ® and ™ are unicode characters.
According to https://thesmsworks.co.uk/unicode-detector ® and ™ are unicode characters.
They can be encoded in Unicode, but OCI CLI encodes them in Windows-1252, which is not valid for JSON: https://en.wikipedia.org/wiki/Windows-1252
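If the output really is Windows-1252, as described here, a lossless alternative to stripping characters is to decode the bytes with that code page before parsing. The sketch below tries UTF-8 first and falls back to cp1252 only when UTF-8 decoding fails. The function name decode_cli_output is hypothetical, the fallback encoding is an assumption taken from this thread, and the sample bytes are simulated rather than captured from the CLI.

```python
import json
from typing import Any


def decode_cli_output(raw: bytes) -> Any:
    """Parse JSON bytes, trying UTF-8 first, then Windows-1252."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 output is Windows-1252, as reported above.
        text = raw.decode("cp1252")
    return json.loads(text)


# Simulated CLI output: the reported line encoded as Windows-1252.
sample = '{"processor-description": "3.0 GHz Ampere® Altra™"}'.encode("cp1252")
parsed = decode_cli_output(sample)
print(parsed["processor-description"])
```

Trying UTF-8 first means the same helper would keep working unchanged if the CLI output were valid UTF-8.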
Can you please share the output received (without any further parsing) from oci-cli when you trigger this command (or via a script)? It will be clearer then.
Are you happy for me to edit out unique identifiers from the output?
When executing commands such as oci compute instance list, json_decode() in PHP can fail when decoding the JSON output. This is because the JSON output can contain non-ASCII characters that are not encoded as Unicode, as required by the specification.
OCI version 3.50.0 (msi package)
Windows 10 version 10.0.19045.5011