scicloj / tablecloth

Dataset manipulation library built on top of tech.ml.dataset
https://scicloj.github.io/tablecloth
MIT License

tc/dataset creation of non-link-csv not working #109

Closed awb99 closed 1 year ago

awb99 commented 1 year ago

Hi!

I am using an API that returns a string that is essentially a CSV file, and I would like to use tablecloth to create a dataset from it. However, this pathway is not supported right now. I think it should be; I would not expect this use case to be that uncommon.

This is how to reproduce the error:

Example of the data that is in the csv file:

(def csv "09/01/2023,26.73,26.95,26.02,26.1,337713\r\n")

(tc/dataset csv {:dataset-name "kibit" :header-row? false :separator ","})
;; => kibit [1 2]:
;;
;; |                                  :$value |                                :$error |
;; |------------------------------------------|----------------------------------------|
;; | 09/01/2023,26.73,26.95,26.02,26.1,337713 | Unrecognized read file type: :1,337713 |

genmeblog commented 1 year ago

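When a string is passed, TMD (tech.ml.dataset) treats it as a filename and tries to read that file. The :1,337713 in the error above is apparently the text after the last dot in the string, taken as the file extension.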

You can either parse the string with an external library (e.g. https://github.com/clojure/data.csv) before passing it to the dataset creator, or convert it to an input stream and set :file-type to :csv (https://techascent.github.io/tech.ml.dataset/tech.v3.dataset.html#var--.3Edataset).
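The input-stream route can be sketched as follows. This is a minimal sketch rather than an official recipe: it calls tech.v3.dataset/->dataset directly (the function linked above), the helper name csv-string->dataset is hypothetical, and UTF-8 encoding is an assumption about the API response.

```clojure
(require '[tech.v3.dataset :as ds])
(import '[java.io ByteArrayInputStream])

(def csv "09/01/2023,26.73,26.95,26.02,26.1,337713\r\n")

(defn csv-string->dataset
  "Hypothetical helper: parses CSV text held in a string by wrapping it
  in an input stream, so TMD does not treat the string as a filename."
  [^String s options]
  ;; Assumption: the text is UTF-8 encoded.
  (with-open [in (ByteArrayInputStream. (.getBytes s "UTF-8"))]
    ;; :file-type has to be given explicitly because a stream has no
    ;; file extension to infer it from.
    (ds/->dataset in (assoc options :file-type :csv))))

(def prices
  (csv-string->dataset csv {:dataset-name "kibit" :header-row? false}))
```

Since tablecloth works on tech.ml.dataset datasets, the result should be usable with the tc/ functions as-is.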

In general, about 90% of dataset creation in tablecloth is passed directly to the tech.v3.dataset/->dataset function, so this issue belongs there.

Closing here.