brandonbloom / fipp

Fast Idiomatic Pretty Printer for Clojure
525 stars 44 forks source link

Babashka compatibility #81

Closed borkdude closed 2 years ago

borkdude commented 2 years ago

Hi @brandonbloom!

Babashka is a Clojure interpreter which is suitable for scripting and has (very) fast startup. It has the ability to execute Clojure libraries from source. Sometimes minor changes are needed to avoid things that babashka cannot (currently) do in a similar way how you would make a library compatible with ClojureScript: through reader conditionals. A recent example of this is the commit in specter.

Recently I've gotten questions about whether fipp and puget (which uses fipp) can run with babashka and I started investigating. I found that with a few tweaks, it can work. Details here: https://github.com/babashka/babashka/issues/1241.

Some things can be fixed in babashka (or the Clojure interpreter library it uses: SCI). Some things are better solved in the library. I will outline those here. Note that these changes can be made in such a way that they only affect behavior for bb while not changing anything for clj/cljs.

  1. in dequeue.cljc fall back on clojure.core vector operations instead of using rrb-vector. I realize this is worse performance-wise but for babashka this may be a reasonable compromise to get things working without pulling in rrb-vector.

  2. in format-hack in ednize.clj start adopting java.time rather then relying on ThreadLocals in clojure.instant. Babashka has full support for java.time while clojure.instant is based on classes prior to java.time. java.time is the modern way to do stuff with dates and times in Java and is available since Java 1.8.

My question to you: are you open to receiving PRs for (some of) the above issues, or should users of babashka maintain a bb-compatible fork somewhere else?

borkdude commented 2 years ago

Here's a screenshot with puget + the above changes made in a local checkout of fipp:

Screen Shot 2022-04-14 at 21 41 16
borkdude commented 2 years ago

Note that [zprint]() already works with bb:

(ns zprint
  (:require [babashka.deps :as deps]))

(deps/add-deps '{:deps {org.babashka/spec.alpha {:git/url "https://github.com/babashka/spec.alpha"
                                                 :git/sha "1a841c4cc1d4f6dab7505a98ed2d532dd9d56b78"}
                        zprint/zprint {:mvn/version "1.2.3"}}})

(require '[zprint.core :as zp])

(println (zp/zprint-file-str (slurp *file*) "yolo" {:color? true :style :dark-color-map}))

but some users would like to use fipp as well.

borkdude commented 2 years ago

Submitted a PR to address point 2: https://github.com/brandonbloom/fipp/pull/82 I think that change would be good in general, not only for bb.

brandonbloom commented 2 years ago

Hey @borkdude, thanks for the detailed writeup and PR #82 (merged!).

In general, I'm willing to make improvements (cleanup, fixes, etc) that also extend compatibility, but would like to avoid accreting compatibility cruft. On that front, I'm disinclined to eliminate the dependency on rrb vectors & would instead encourage you to upstream the compatibility fixes there.

Perhaps worth mentioning is that Fipp used to use finger-trees for ClojureScript (See a2bd18a92a8480bd84a0410ae9041ed6f2fedcf0), but rrb-vectors grew support there. I might also consider an alternative dependency that provides a dequeue interface more abstractly.

borkdude commented 2 years ago

On that front, I'm disinclined to eliminate the dependency on rrb vectors

Just to be clear, I'm not asking to eliminate the dependency, the change would be hidden behind reader conditionals for bb. clj + cljs would still be using rrb. But I understand that you would not like to deal with compatibility for more than clj and cljs.

borkdude commented 2 years ago

I haven't looked too deeply into this library, but Java also comes with a Deque implementation:

https://docs.oracle.com/javase/7/docs/api/java/util/Deque.html

Perhaps you've already considered that.

brandonbloom commented 2 years ago

Needs a functional/persistent dequeue.

I didn't mean to cause you to close this issue. Still open to small changes to support Babashka, but my hope is that can be as easy as bumping an rrb-vector dependency.

borkdude commented 2 years ago

A few thoughts:

brandonbloom commented 2 years ago

Babashka could decide to include rrb as a built-in dep.

Looking at this patch here: https://github.com/babashka/babashka/issues/1241#issue-1204914375

I don't know anything about Babashka, but do you have some sort of library/namespace/function stub/patch mechanism? Seems like you could stub either fipp.deque or clojure.core.rrb-vector and you only need that one catvec function?

borkdude commented 2 years ago

@brandonbloom Babashka is able to "fake" any Clojure function via the SCI (interpreter) configuration, so that's certainly an option to try. I will try that soon.

borkdude commented 2 years ago

I have a development version of babashka here (for macOS):

https://output.circle-artifacts.com/output/job/319f007f-70c2-4b1b-af39-e7fb78269d33/artifacts/0/release/babashka-0.8.2-SNAPSHOT-macos-amd64.tar.gz

which can run fipp (the current version from master).

What I've done is include rrb vector proper, but only the catvec function. This adds 350kb to the binary. Might be worth it, but I'll wait for feedback in https://github.com/greglook/puget/issues/56, to see if I can keep it at that for compatibility for fipp + puget.

$ rlwrap ./bb
Babashka v0.8.2-SNAPSHOT REPL.
Use :repl/quit or :repl/exit to quit the REPL.
Clojure rocks, Bash reaches.

user=> (babashka.deps/add-deps '{:deps {fipp/fipp {:deps/manifest :deps :git/url "https://github.com/brandonbloom/fipp" :git/sha "efdf87f22a6fcc08cef5017c14769df29fc37850"}}})

nil
user=> (require '[fipp.edn :as fipp])
nil
user=> (fipp/pprint (range 45))
(0
 1
 2
 3
 4
borkdude commented 2 years ago

Finally got to the point that hashp is working in babashka (current master version)

;; add fipp from master
(babashka.deps/add-deps '{:deps {fipp/fipp {:deps/manifest :deps :git/url "https://github.com/brandonbloom/fipp" :git/sha "efdf87f22a6fcc08cef5017c14769df29fc37850"}}})
;; add newest zprint which is already compatible with bb (hashp also loads zprint)
(babashka.deps/add-deps '{:deps {zprint/zprint {:mvn/version "1.2.3"}}})
;; add babashka fork of clojure spec
(babashka.deps/add-deps '{:deps {org.babashka/spec.alpha {:git/url "https://github.com/babashka/spec.alpha"
                         :git/sha "1a841c4cc1d4f6dab7505a98ed2d532dd9d56b78"}}})
;; add hashp
(babashka.deps/add-deps '{:deps {hashp/hashp {:mvn/version "0.2.1"}}})

(require '[hashp.core])
(set! *data-readers* {'p hashp.core/p*})
Screen Shot 2022-04-30 at 11 35 27
borkdude commented 2 years ago

If you can release a new version of fipp, then we don't need the git/sha stuff anymore and then I think we can close this issue.

brandonbloom commented 2 years ago

What I've done is include rrb vector proper, but only the catvec function. This adds 350kb to the binary.

I'm not sure I follow, but I am curious: What does it mean to add it "proper" but only one function?

brandonbloom commented 2 years ago

deployed: https://clojars.org/fipp/versions/0.6.26

borkdude commented 2 years ago

@brandonbloom It means I've only exposed the catvec function to the interpreter:

https://github.com/babashka/babashka/blob/master/feature-rrb-vector/babashka/impl/rrb_vector.clj

GraalVM native-image will analyze what is used and will not include other vars in the native image, if they are not reachable.

borkdude commented 2 years ago

Thanks!

brandonbloom commented 2 years ago

It means I've only exposed the catvec function to the interpreter

Since catvec returns an object of the rrb vector class, presumably that means a fair amount of code is then included?

borkdude commented 2 years ago

@brandonbloom Yes, that fair amount of code yields about 350kb.