mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License

Panic while reading csv file with more than 300 empty records #89

Closed sid-habu closed 1 year ago

sid-habu commented 1 year ago

We stumbled upon a use case where a CSV file had more than 300 empty rows. Any operation on the file results in a panic due to a divide-by-zero.

Version v1.17.10

Sample File that fails

empty.csv

Sample Query that fails

csvq "select h1 from empty"
panic: runtime error: makeslice: cap out of range

goroutine 35 [running]:
github.com/mithrandie/csvq/lib/query.readRecordSet.func1()
    github.com/mithrandie/csvq/lib/query/load_view.go:1233 +0x12c
created by github.com/mithrandie/csvq/lib/query.readRecordSet
    github.com/mithrandie/csvq/lib/query/load_view.go:1216 +0x17c

We also use this library in an app server, and the panic in the goroutine at load_view.go:1233 crashes our server because it can't be recovered from the parent goroutine.
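To illustrate the point about recovery: in Go, a deferred recover only catches a panic raised in its own goroutine, so a caller cannot guard against a panic in a goroutine that a library starts internally. The sketch below is not csvq code. `runGuarded` is a hypothetical helper, and the negative slice capacity is only a stand-in for the failing load. It shows that the recover has to live inside the spawned goroutine, which is why this needs a fix in the library itself.

```go
package main

import (
	"fmt"
)

// runGuarded (hypothetical helper) runs work in its own goroutine and turns a
// panic raised there into an error. The deferred recover must be installed
// inside the spawned goroutine; a recover in the caller would never see it.
func runGuarded(work func()) error {
	errCh := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				errCh <- fmt.Errorf("recovered from panic: %v", r)
				return
			}
			errCh <- nil
		}()
		work()
	}()
	return <-errCh
}

func main() {
	// Stand-in for the failing load: a negative capacity computed at run time
	// makes make() panic, similar in kind to the trace in this issue.
	err := runGuarded(func() {
		n := -1
		_ = make([]int, 0, n)
	})
	fmt.Println(err)
}
```

An application can only use this pattern for goroutines it starts itself.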

As a stop-gap solution, we are patching the library with this fix in our fork https://github.com/deklareddotcom/habu-csvq/pull/1

What are your thoughts on the bug? Please let me know if you have any better suggestions for fixing it. I am happy to help contribute a fix back.

ondohotola commented 1 year ago

In the meantime one can get qsv and do

qsv dedup empty.csv | csvq "select * from stdin"

el

On 5. Nov 2022 at 10:56 +0200, Siddharth Sharma wrote:

    We stumbled upon a use case where a csv file had more than 300 empty rows [...]

sid-habu commented 1 year ago

@ondohotola Thank you for the suggestion. That certainly works for the CLI. However, our app uses csvq-driver (https://github.com/mithrandie/csvq-driver), which internally uses csvq as a library.

mithrandie commented 1 year ago

This bug has been fixed in version 1.17.11.

derekmahar commented 1 year ago

In the meantime one can get qsv and do qsv dedup empty.csv | csvq "select * from stdin"

Where can I find qsv?

derekmahar commented 1 year ago

In the meantime one can get qsv and do qsv dedup empty.csv | csvq "select * from stdin"

Where can I find qsv?

I found qsv: https://github.com/jqnatividad/qsv

ondohotola commented 1 year ago

Next Door :-)-O

Indeed that's the one.

el

On 05/11/2022 17:45, Derek Mahar wrote:

    I found qsv https://github.com/jqnatividad/qsv.

sid-habu commented 1 year ago

@mithrandie That was quick, thank you so much.