Lisp-Stat / data-frame

Data frames for Common Lisp
https://lisp-stat.github.io/data-frame
Microsoft Public License

[BUG] [CSV] First Column Cannot be removed #21

Closed hauter closed 11 months ago

hauter commented 1 year ago

image

image

The first column cannot be removed from a data frame read from a CSV file.

https://github.com/Lisp-Stat/data-frame/issues/20

I see. I think it's great that you're learning Common Lisp with Lisp-Stat. Lisp-Stat is stable enough to be used for analysis, but the edge cases aren't well covered in the test suites. For example, removing all the columns of a data-frame isn't something that you normally see in practice, because that would leave you no data to work with! You can remove one column though: (remove-columns df1 '(name)) works. The examples in the IPS (Introduction to the Practice of Statistics) repo might be helpful if you're looking to use JupyterLab for your analysis.

However, I'm glad you're testing these things! We'll fix everything you find. Would you mind opening up a new issue for this latest bug?

hauter commented 1 year ago

image

(remove-columns *df1* '(name)) also doesn't work.

snunez1 commented 1 year ago

Hmm. That is odd. Here's a trace of my session:

LS-USER> (defparameter *df1* (read-csv #P"~/Desktop/test.csv"))
*DF1*
LS-USER> (print-data *df1*)

;;   NAME  APP  
;; 0 a.com app_a
;; 1 b.com app_b
;; 2 c.com app_c
NIL

LS-USER> (remove-columns *df1* '(name))
#<DATA-FRAME (3 observations of 1 variables)>
LS-USER> (print-data *)

;;   APP  
;; 0 app_a
;; 1 app_b
;; 2 app_c
NIL
LS-USER> 

The only difference I can see is that you're not in the LS-USER package. This might mean some of the functions you need aren't available, though I'd have expected a warning. In any case, the system is designed for you to do most work in the LS-USER package, which also uses CL, so anything you can do from CL-USER you can do from LS-USER. Give that a try and see how it goes.

hauter commented 1 year ago

Switching to the LS-USER package doesn't make a difference, but I found the root cause.

image

A CSV file encoded as UTF-8 with BOM reproduces the problem. After I changed the encoding to plain UTF-8 (without BOM), everything works.
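
For anyone who cannot re-save the file, a possible workaround is to strip the byte order mark before the CSV text is parsed. The likely mechanism (an assumption, not confirmed in this thread) is that the BOM decodes to the character U+FEFF and ends up as the first character of the first column name, so the key is no longer NAME. The helper names below (STRIP-BOM, *BOM-CHAR*, *HEADER-WITH-BOM*) are hypothetical and not part of Lisp-Stat; this is a minimal sketch in standard Common Lisp, assuming an implementation with Unicode characters such as SBCL.

```lisp
;; Sketch only: STRIP-BOM and *BOM-CHAR* are hypothetical helpers,
;; not part of Lisp-Stat or data-frame.

(defvar *bom-char* (code-char #xFEFF)
  "U+FEFF, the character that a UTF-8 byte order mark decodes to.")

(defun strip-bom (text)
  "Return TEXT without a leading U+FEFF, or TEXT unchanged if there is none."
  (if (and (plusp (length text))
           (char= (char text 0) *bom-char*))
      (subseq text 1)
      text))

;; A header line as it might look after decoding a UTF-8-with-BOM file.
(defvar *header-with-bom*
  (concatenate 'string (string *bom-char*) "name,app"))

;; Before stripping, the first character is the BOM rather than #\n.
(print (char= (char *header-with-bom* 0) #\n))
;; After stripping, the header is the expected text.
(print (string= (strip-bom *header-with-bom*) "name,app"))
```

If read-csv accepts a string of CSV data (its documentation suggests it does, but treat that as an assumption), the file contents could be read into a string first, for example with uiop:read-file-string, passed through strip-bom, and then handed to read-csv.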

snunez1 commented 1 year ago

Glad you found the solution. There was another discussion on this topic here: https://stackoverflow.com/questions/46260357/sbcl-encoding-and-decoding-characters-without-actual-i-o

I'll document this behaviour and leave this issue open in case someone else runs into the problem.

hauter commented 1 year ago

Nice, thank you for your support. I'll keep trying other features of Lisp-Stat.