scicloj / tablecloth

Dataset manipulation library built on top of tech.ml.dataset
https://scicloj.github.io/tablecloth
MIT License

The stocks example in the docs is broken (column type problem) #153

Open daslu opened 5 months ago

daslu commented 5 months ago

Looking into the Stocks example:

clj -Sdeps '{:deps {scicloj/tablecloth {:mvn/version "7.029.2"}}}'
Clojure 1.11.2

(require '[tablecloth.api :as tc])
nil

(defonce stocks (tc/dataset "https://raw.githubusercontent.com/techascent/tech.ml.dataset/master/test/data/stocks.csv" {:key-fn keyword}))
#'user/stocks

(-> stocks
    (tc/group-by (fn [row]
                   {:symbol (:symbol row)
                    :year (tech.v3.datatype.datetime/long-temporal-field :years (:date row))}))
    (tc/aggregate #(tech.v3.datatype.functional/mean (% :price)))
    (tc/order-by [:symbol :year]))

Execution error at tech.v3.datatype.datetime.operations/long-temporal-field (operations.clj:336).
Data datatype (:string) is not a date time datatype.

For some reason, the :date column is now parsed as :string type, so the datetime processing does not work:

(:date stocks)
#tech.v3.dataset.column<string>[560]
:date
[2000-01-01, 2000-02-01, 2000-03-01, 2000-04-01, 2000-05-01, 2000-06-01, 2000-07-01, 2000-08-01, Sep 1 2000, Oct 1 2000, Nov 1 2000, Dec 1 2000, Jan 1 2001, Feb 1 2001, Mar 1 2001, Apr 1 2001, May 1 2001, Jun 1 2001, Jul 1 2001, Aug 1 2001...]
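The mixed values above (ISO dates for some rows, strings such as Sep 1 2000 for others) suggest that automatic date inference succeeded for only part of the column. As a possible workaround, here is a minimal sketch that parses both shapes explicitly with java.time. The names `month-name-format` and `parse-stock-date` are hypothetical helpers, not part of tablecloth or tech.ml.dataset, and pinning the formatter to `Locale/US` is an assumption about what makes the month abbreviations parse reliably.

```clojure
(import '(java.time LocalDate)
        '(java.time.format DateTimeFormatter DateTimeParseException)
        '(java.util Locale))

;; Hypothetical helper, not part of tablecloth: the formatter is pinned to
;; Locale/US so month abbreviations do not depend on the JVM default locale.
(def month-name-format
  (DateTimeFormatter/ofPattern "MMM d yyyy" Locale/US))

(defn parse-stock-date
  "Parses either an ISO date (\"2000-08-01\") or a month-name date
  (\"Sep 1 2000\") into a java.time.LocalDate."
  [^String s]
  (try
    ;; ISO-8601 shape, e.g. 2000-08-01
    (LocalDate/parse s)
    (catch DateTimeParseException _
      ;; Month-name shape, e.g. Sep 1 2000
      (LocalDate/parse s month-name-format))))

;; Both shapes seen in the :date column yield a LocalDate.
(parse-stock-date "2000-08-01")
(parse-stock-date "Sep 1 2000")
```

In the grouping function from the example, `(.getYear (parse-stock-date (:date row)))` could stand in for the `long-temporal-field` call, though that has not been tested against the dataset in this thread.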

genmeblog commented 5 months ago

Which JDK version does this happen on? 11, by any chance?

genmeblog commented 5 months ago

I remember @kiramclean encountered something similar on JDK 11, but I can't find the Zulip discussion now.

daslu commented 5 months ago

Oh, thanks.

I ran it on JDK 21 when it failed.

Now, I tried it on JDK 17, and it was ok.
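For anyone reproducing this, a small sketch for reporting the JVM details from the same REPL session. It uses plain JVM interop only; `jvm-info` is a hypothetical helper name, and including the default locale is an assumption (the thread itself only discusses JDK versions).

```clojure
;; Plain JVM interop, no tablecloth calls involved.
;; jvm-info is a hypothetical helper name used only for illustration.
(defn jvm-info []
  {:java-version   (System/getProperty "java.version")
   :java-vendor    (System/getProperty "java.vendor")
   ;; Assumption: the locale may matter for date parsing; the thread does not confirm this.
   :default-locale (str (java.util.Locale/getDefault))})

(jvm-info)
```

Including this map in a report makes it easier to compare a failing setup with a working one.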

kirahowe commented 5 months ago

I remember running into something like this, too... bah, I'll see if I can dig out the Zulip conversation. IIRC it was some very strange issue with JDK versions, or something ⏳👀

kirahowe commented 5 months ago

Yeah, it's in the conversation starting roughly here: https://clojurians.zulipchat.com/#narrow/stream/151924-data-science/topic/tablecloth/near/428193802

Looks like in the end I just wrote it off as a problem with JVM 17.x and moved on -- are you still experiencing issues with it, @daslu?

daslu commented 5 months ago

@kiramclean, thanks; I moved to JVM 17.x and then everything seemed fine.