namoopsoo / play-clj-ml

Messing around with machine learning in clojure

Milestone 2: Create first super simple classifier on the basic geo annotated data #2

Closed namoopsoo closed 6 years ago

namoopsoo commented 6 years ago

Make a simple linear model using the extremely simple dataset with a single independent variable, just the borough:

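```clojure
;; Aliases assumed here: io = clojure.java.io, csv = clojure.data.csv.
;; mtrix, pass and parse-table-as-doubles are defined elsewhere in the project.
(def fname "201510-citibike-tripdata.simple.csv")

(defn load-csv-data
  [fname]
  (let [table (with-open [reader (io/reader fname)]
                (->> (csv/read-csv reader)
                     (mapv pass)))
        header-row (first table)
        columns (->> (rest table)
                     (parse-table-as-doubles)
                     (mtrix/transpose))]
    {:header header-row
     :columns columns}))
```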
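The helpers called inside `load-csv-data` are not shown in this issue. Below is a minimal sketch of what the parse and transpose steps could look like in plain Clojure, assuming every cell is numeric. `parse-rows-as-doubles`, `transpose-rows`, the `id` column and the sample rows are made-up stand-ins, not the project's code.

```clojure
;; Hypothetical stand-in for the parse-table-as-doubles helper.
(defn parse-rows-as-doubles
  "Convert a sequence of rows of numeric strings into vectors of doubles."
  [rows]
  (mapv (fn [row] (mapv #(Double/parseDouble %) row)) rows))

;; Hypothetical stand-in for mtrix/transpose; expects at least one row.
(defn transpose-rows
  "Turn a vector of rows into a vector of columns."
  [rows]
  (apply mapv vector rows))

;; Made-up sample shaped like a parsed CSV: header row, then data rows.
(def example-table
  [["id" "start_sublocality" "end_sublocality"]
   ["0" "1" "2"]
   ["1" "2" "2"]])

(def example-data
  {:header  (first example-table)
   :columns (-> (rest example-table)
                parse-rows-as-doubles
                transpose-rows)})

(:columns example-data)
;; => [[0.0 1.0] [1.0 2.0] [2.0 2.0]]
```

With columns laid out this way, `(nth (:columns data) 1)` picks out one whole variable as a vector.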

* ...
```clojure
(defn make-super-simple-linear-model
  [data-table]
  (let [Y (nth (:columns data-table) 2)   ; dependent variable: column 2
        X (nth (:columns data-table) 1)]  ; single predictor: column 1
    (linear-model Y X)))

; on repl...
matrix-project.core=> (def simple-model (make-super-simple-linear-model simple-data))
#'matrix-project.core/simple-model

matrix-project.core=> (simple-model :coefs)
[0.711438996099119 0.6281347329684195]
matrix-project.core=> (keys simple-model)
(:y :sse :msr :design-matrix :mse :t-probs :adj-r-square :df :coef-var :residuals :ssr :sst :coefs :f-stat :r-square :f-prob :t-tests :x :std-errors :fitted :coefs-ci)

matrix-project.core=> (:sse simple-model)
55234.16477182651
matrix-project.core=> (:mse simple-model)
0.06213157759542773
matrix-project.core=> (Math/sqrt (:mse simple-model))
0.24926206609796792
matrix-project.core=> (:r-square simple-model)
0.39422113636387146

matrix-project.core=> (def inputs [1 2 3])
#'matrix-project.core/inputs

matrix-project.core=> (simple-predict simple-model inputs)
(1.3395737290675385 2.0510127251666574 2.7624517212657764)
```
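`simple-predict` is not shown in this issue. As a rough sketch of what a one-predictor fit and prediction involve, here is ordinary least squares in plain Clojure. `fit-simple-linear` and `predict` are hypothetical names, and the `[intercept slope]` ordering is an assumption. If `linear-model` is Incanter's, which the result keys suggest, its `:coefs` should also come intercept first, but that is worth verifying.

```clojure
(defn mean
  "Arithmetic mean as a double."
  [xs]
  (/ (reduce + xs) (double (count xs))))

;; Hypothetical helper: closed-form least squares for a single predictor.
(defn fit-simple-linear
  "Returns [intercept slope] minimizing squared error of ys against xs."
  [ys xs]
  (let [mx        (mean xs)
        my        (mean ys)
        sxy       (reduce + (map (fn [x y] (* (- x mx) (- y my))) xs ys))
        sxx       (reduce + (map (fn [x] (let [d (- x mx)] (* d d))) xs))
        slope     (/ sxy sxx)
        intercept (- my (* slope mx))]
    [intercept slope]))

;; Hypothetical helper: assumes coefs are ordered [intercept slope].
(defn predict
  "Apply intercept + slope * x to each input."
  [[intercept slope] xs]
  (map (fn [x] (+ intercept (* slope x))) xs))

;; Toy data lying exactly on y = 1 + 2x.
(fit-simple-linear [3 5 7] [1 2 3])
;; => [1.0 2.0]

(predict [1.0 2.0] [4])
;; => (9.0)
```

Read intercept first, the coefficients above give 0.7114 + 0.6281 * 2 ≈ 1.9677 for input 2, while the printed prediction 2.0510 equals 0.6281 + 0.7114 * 2. The coefficient order used inside `simple-predict` may be worth double-checking.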

* bringing back the transition numbers from #1 ...
```python
                                     unit
start_sublocality end_sublocality        
1                 1                 67863
                  2                 17267
                  3                  3790
2                 1                 16972
                  2                771653
                  3                  2133
3                 1                  3595
                  2                  1921
                  3                  3795
```