Closed ieugen closed 2 years ago
Do you have a proposal for how to add this to the existing navigator
function without breaking compatibility? We'd need to switch on the class of xml
but I'm not sure what the most idiomatic way to do that is: multimethods? Protocols?
How about something like this?
(defmulti navigator
(fn
([xml] (class xml))
([xml namespace-aware] (class xml))))
(defmethod navigator java.lang.String
([^String xml] (navigator xml false))
([^String xml namespace-aware]
(let [vg (doto (VTDGen.) (.setDoc (.getBytes xml "UTF-8"))
(.parse namespace-aware))]
(Navigator. (.getNav vg)))))
(defmethod navigator (Class/forName "[B")
([^"[B" xml] (navigator xml false))
([^"[B" xml namespace-aware]
(let [vg (doto (VTDGen.) (.setDoc xml)
(.parse namespace-aware))]
(Navigator. (.getNav vg)))))
And here's an implementation using protocols instead:
(defprotocol Navigable
(navigator [xml] [xml namespace-aware]))
(extend-protocol Navigable
java.lang.String
(navigator
([^String xml] (navigator xml false))
([^String xml namespace-aware]
(let [vg (doto (VTDGen.) (.setDoc (.getBytes xml "UTF-8"))
(.parse namespace-aware))]
(Navigator. (.getNav vg))))))
(extend-protocol Navigable
(Class/forName "[B")
(navigator
([^"[B" xml] (navigator xml false))
([^"[B" xml namespace-aware]
(let [vg (doto (VTDGen.) (.setDoc xml)
(.parse namespace-aware))]
(Navigator. (.getNav vg))))))
Are there types other than java.lang.String
and [B
(Java's native byte array) we'd need to implement this for?
Hi,
Thanks for the positive feedback. I think String and byte array are good enough.
IMO String variant should allow to set other charsets than UTF-8. Or we could consider people should use the byte version if they have strings with other encodings.
I thought that InputStream might also be a candidate - but saw that VTDGen can't use streams :( .
Thanks,
It might be best to encourage people to pass a byte array to navigator
and leave the UTF-8 String option as a convenience. That way, people can pass strings with other encodings by passing (.getBytes my-string "ISO-8859-1")
and we keep this a separate concern?
I'm leaning towards the protocol implementation as I can see a future where you could, say, pass a file handler, etc.
Yes, I think protocol implementation is good. I would add the recommendations in as function descriptions / help.
Thanks for providing deps.edn, I think it's going to be very useful.
I've now raised draft PR #9 with this change. Would you please be able to test it with SHA da2f0bb345adb65908aed35cbcef5a393c453b8b and see if that works as expected?
I've just released version 0.2.0 of riveted with this new API:
; Generate a navigator for XML in my-byte-array without namespace support
(navigator my-byte-array)
; Generate a navigator for XML in my-byte-array with namespace support
(navigator my-byte-array true)
We are using this library and I noticed uses the String arg and converts to byte array. I believe it should also provide a byte array + charset API.
One use case is for example we use LMDB and going from bytes > string -> bytes is not efficient.