RedHatInsights / insights-core

Insights Core is a data collection and processing framework used by Red Hat Insights
https://cloud.redhat.com/insights
Apache License 2.0

DatasourceProvider reads file-content in "bytes" format, causing parsing to fail #2970

Closed: xiangce closed this issue 3 years ago

xiangce commented 3 years ago

Original exception raised by the parser:

  File "/work/insights/insights-core/insights/core/__init__.py", line 95, in _handle_content
    self.parse_content(context.content)
  File "/work/insights/insights-core/insights/parsers/ld_library_path.py", line 56, in parse_content
    user, _, raw = [s.strip() for s in line.partition(' ')]
AttributeError: 'int' object has no attribute 'partition'

Collected meta_data:

# cat /tmp/insights-vm37-39.gsslab.pek2.redhat.com-20210308120914/meta_data/insights.specs.Specs.ld_library_path_of_user.json | python3 -m json.tool
{
    "exec_time": 0.0009329319000244141,
    "ser_time": 0.0006611347198486328,
    "errors": [],
    "name": "insights.specs.Specs.ld_library_path_of_user",
    "results": {
        "object": {
            "relative_path": "insights_commands/echo_user_LD_LIBRARY_PATH"
        },
        "type": "insights.core.spec_factory.DatasourceProvider"
    }
}

The content:

# insights-inspect insights.specs.Specs.ld_library_path_of_user /tmp/insights-vm37-39.gsslab.pek2.redhat.com-20210308120914

In [1]: type(ld_library_path_of_user)
Out[1]: insights.core.spec_factory.SerializedRawOutputProvider

In [2]: type(ld_library_path_of_user.content)
Out[2]: bytes

In [3]: ld_library_path_of_user.content
Out[3]: b'rh1adm /usr/sap/RH1/SYS/exe/run:/usr/sap/RH1/SYS/exe/uc/linuxx86_64:/sapdb/clients/RH1/lib\nsr1adm /usr/sap/SR1/HDB02/exe/krb5/lib/krb5/plugins/preauth:/usr/sap/SR1/HDB02/exe/krb5/lib:/usr/sap/SR1/HDB02/exe:/usr/sap/SR1/HDB02/exe/Python/lib:/usr/sap/SR1/HDB02/exe/filter:/usr/sap/SR1/HDB02/exe/dat_bin_dir:/usr/sap/SR1/HDB02/exe/plugins/afl:/usr/sap/SR1/HDB02/exe/plugins/lcapps:/usr/sap/SR1/HDB02/exe/plugins/repository:/usr/sap/SR1/HDB02/exe/plugins/epmmds:/usr/sap/SR1/SYS/global/hdb/federation:/usr/sap/SR1/SYS/global/hdb/plugins/3rd_party_libs\nrh2adm /usr/sap/RH2/SYS/exe/run:/usr/sap/RH2/SYS/exe/uc/linuxx86_64:/sapdb/clients/RH2/lib'
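
The `bytes` content is consistent with the AttributeError in the traceback: iterating over a `bytes` object in Python 3 yields integers, not lines, so `line.partition(' ')` is called on an `int`. The sketch below reproduces that with a shortened, made-up sample of the content (the two paths are illustrative, not the full collected value) and shows that decoding and splitting into lines gives the list of `str` that the `partition` line from the traceback can handle. It is only an illustration of the failure mode, not the parser's actual code.

```python
# Shortened sample of the collected content (hypothetical, for illustration only).
content = b"rh1adm /usr/sap/RH1/SYS/exe/run\nsr1adm /usr/sap/SR1/HDB02/exe"

# Iterating over bytes yields integers (byte values), not text lines.
first = next(iter(content))
print(type(first).__name__, hasattr(first, "partition"))

# Decoding and splitting gives a list of str, one entry per line.
lines = content.decode("utf-8").splitlines()

# Same expression as in the traceback, now applied to str lines.
parsed = {}
for line in lines:
    user, _, raw = [s.strip() for s in line.partition(" ")]
    parsed[user] = raw

print(parsed)
```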

I'm unsure if it's OK to modify the load method to read the file to a list directly:

self.loaded = True
with open(self.path, 'r') as f:
    return f.readlines()

@csams, @bfahr - would you please have a look?

csams commented 3 years ago

Hey @xiangce, we can change the SerializedRawOutputProvider in the deserializer to a SerializedOutputProvider.

The raw providers exist in case we ever need to collect binary data instead of text. I defaulted datasource providers to raw binary since it was the most general option, but we really need datasource functions to be able to return different objects for text and raw bytes.
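
The distinction between raw and text providers can be sketched roughly as follows. This is a minimal illustration, not the insights-core implementation: `RawFileProvider`, `TextFileProvider`, and `demo` are hypothetical names, and the real `SerializedRawOutputProvider` and `SerializedOutputProvider` classes may load content differently. The only point is that the same file gives `bytes` from a raw loader and a list of `str` lines from a text loader.

```python
import os
import tempfile


class RawFileProvider:
    """Hypothetical provider that returns the file content as raw bytes."""

    def __init__(self, path):
        self.path = path
        self.loaded = False

    def load(self):
        self.loaded = True
        with open(self.path, "rb") as f:
            return f.read()


class TextFileProvider(RawFileProvider):
    """Hypothetical provider that returns the file content as a list of str lines."""

    def load(self):
        self.loaded = True
        with open(self.path, "r", encoding="utf-8") as f:
            # Strip the trailing newline so parsers see clean lines.
            return [line.rstrip("\n") for line in f]


def demo():
    # Write a small made-up sample file and load it with both providers.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "echo_user_LD_LIBRARY_PATH")
        with open(path, "w", encoding="utf-8", newline="\n") as f:
            f.write("rh1adm /a:/b\nsr1adm /c\n")
        return RawFileProvider(path).load(), TextFileProvider(path).load()


raw, text = demo()
print(type(raw).__name__, type(text).__name__)
```

A parser that iterates over lines and calls `str` methods on them would work with the text loader's result, while the raw result would need decoding first.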

bfahr commented 3 years ago

Tested with mocked cloud_cfg. Results before change:

$ insights cat insights.specs.Specs.cloud_cfg insights-devboxseven-20210323173222.tar.gz 
SerializedRawOutputProvider("'/tmp/insights-9yjamkj8/insights-devboxseven-20210323173222/data/etc/cloud/cloud.cfg'")

After change:

$ insights cat insights.specs.Specs.cloud_cfg insights-devboxseven-20210323173222.tar.gz 
SerializedOutputProvider("'/tmp/insights-g2h3aonk/insights-devboxseven-20210323173222/data/etc/cloud/cloud.cfg'")
{"version": 1, "config": [{"subnets": [{"type": "dhcp"}, {"type": "dhcp6"}], "type": "physical", "name": "eth0"}]}

$ insights inspect insights.specs.Specs.cloud_cfg insights-devboxseven-20210323173222.tar.gz 

IPython Console Usage Info:

Enter 'cloud_cfg.' and tab to get a list of properties 
Example:
In [1]: cloud_cfg.<property_name>
Out[1]: <property value>

To exit ipython enter 'exit' and hit enter or use 'CTL D'

Starting IPython Interpreter Now 

In [1]: type(cloud_cfg)
Out[1]: insights.core.spec_factory.SerializedOutputProvider

In [2]: type(cloud_cfg.content)
Out[2]: list

In [3]: cloud_cfg.content
Out[3]: ['{"version": 1, "config": [{"subnets": [{"type": "dhcp"}, {"type": "dhcp6"}], "type": "physical", "name": "eth0"}]}']