chdb-io / chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
https://clickhouse.com/chdb
Apache License 2.0

Feature Request: Support DWARF input format #127

Closed: vivekseth closed this issue 10 months ago

vivekseth commented 1 year ago

Describe what's wrong

The README links to this documentation page for the formats chdb supports.

According to the sample code here, it seems you should be able to run a query like this to read data from a DWARF file:

SELECT
    unit_name,
    count() AS c
FROM file('programs/clickhouse', DWARF)
WHERE tag = 'subprogram' AND NOT has(attr_name, 'declaration')
GROUP BY unit_name
ORDER BY c DESC
LIMIT 3

However, when I try running this command,

python3 -m chdb "select * from file('SomeFile', 'DWARF')" JSON

I get the following output:

Code: 73. DB::Exception: Unknown format DWARF. (UNKNOWN_FORMAT)

This link reproduces the issue: https://fiddle.clickhouse.com/5271b72e-a405-4826-bff5-a6cb8772b11c

Does it reproduce on recent release?

I'm using chdb 0.14.2

Expected behavior

Either the DWARF format should work, or it should be removed from the list of supported input formats. If users need to use a different format string, then the documentation should be updated to reflect this.

Error message and/or stacktrace

Code: 73. DB::Exception: Unknown format DWARF. (UNKNOWN_FORMAT)

lmangani commented 1 year ago

You can see the currently supported formats using SELECT * FROM system.formats

DWARF support is not currently compiled into chdb (only the most popular formats are included to save space), so I would classify this as a feature request.
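
For anyone who wants to check this from Python rather than by reading the full table, here is a minimal sketch. The helper name `has_input_format` is made up for illustration. It assumes `chdb.query(sql, output_format)` returns an object whose `str()` is the result text, and that `system.formats` exposes `name` and `is_input` columns as it does in ClickHouse.

```python
import re


def has_input_format(format_name, run_query):
    """Return True if `format_name` is listed as an input format.

    `run_query` is any callable taking (sql, output_format), for example
    chdb.query. Taking it as a parameter is an illustration choice that
    keeps this helper (a hypothetical name) usable without chdb installed.
    """
    # Format names are plain identifiers; reject anything else so the
    # name can be safely embedded in the SQL string below.
    if not re.fullmatch(r"[A-Za-z0-9_]+", format_name):
        raise ValueError(f"unexpected format name: {format_name!r}")
    sql = (
        "SELECT count() FROM system.formats "
        f"WHERE name = '{format_name}' AND is_input"
    )
    # Assumption: str() of the result is the CSV text, i.e. "1" or "0".
    return str(run_query(sql, "CSV")).strip() == "1"


if __name__ == "__main__":
    try:
        import chdb
    except ImportError:
        print("chdb is not installed; run `pip install chdb` first")
    else:
        for name in ("CSV", "Parquet", "DWARF"):
            print(name, has_input_format(name, chdb.query))
```

Format names in `system.formats` are case-sensitive, so pass `DWARF` and not `Dwarf`. On a build without DWARF compiled in, the last line should report `False`, which matches the `UNKNOWN_FORMAT` error above.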

vivekseth commented 1 year ago

yeah this totally makes sense as a feature request then.

Would it make sense to update the readme to mention that only the most popular formats are included, and how to get a list of the supported formats?

I can probably submit a quick PR for that if you want

lmangani commented 1 year ago

Thanks @vivekseth. Our public demo can answer this question in real time, so perhaps we can link to that instead of maintaining static text.

auxten commented 1 year ago

> yeah this totally makes sense as a feature request then.
>
> Would it make sense to update the readme to mention that only the most popular formats are included, and how to get a list of the supported formats?
>
> I can probably submit a quick PR for that if you want

A PR is really, really welcome. FYI, most compile flags are controlled here: https://github.com/chdb-io/chdb/blob/main/chdb/build.sh

lmangani commented 1 year ago

@vivekseth available for testing: https://pypi.org/project/chdb/0.16.0rc2