venantius / yagni

A Leiningen plugin for finding dead code
Eclipse Public License 1.0
219 stars 10 forks source link

Support for vars registered as tagged literal callbacks #24

Open moea opened 9 years ago

moea commented 9 years ago

i.e. dealing with data_readers.clj

(Thanks for the project, it's already proving useful to me)

venantius commented 9 years ago

Sorry, could you provide a link to the source file in question?

moea commented 9 years ago

I was referring to the general mechanism of [tagged literals](http://clojure.org/reader#The Reader--Tagged Literals).

e.g. If I have a project with a data_readers.clj like so:

{my-namespace/my-symbolic-tag actual.qualified/var
 ...}

Without explicit support for data_readers.clj (which is not a valid Clojure source file - the keys are unquoted symbols), spurious warnings would be generated unless I declared actual.qualified/var as an entrypoint, alongside any other vars I listed a values in the map. It's not the most widespread feature, but it's an additional source of usage information.

I don't think it would be particularly fruitful to try and track the uses of the literals themselves, but treating the handlers as entrypoints seems sensible.

It may just be a weird suggestion, though.

venantius commented 9 years ago

No, it seems to be a perfectly valid suggestion. And I like your proposal for how to integrate it; though I'd be curious in exploring a little bit further the possibility of tracking the literals themselves. After all, if we're defining literals that don't get used then those are just as bad on some level, surely? I suppose I don't really know what these literals look like in practice, and I'm having trouble grokking it by looking at the codebases linked to from your google search. Would you mind providing what you think an idiomatic example looks like?

moea commented 9 years ago

It's not an amazing example, but it's linkable, and I wrote it: there's a couple of literals defined here (one confusingly called "literal"). In that project, they're used in situations like this to clarify some detail of a potentially ambiguous data structure - a little nicer than just writing ^:hildebrand/path and hoping.

They're also used to carry stuff around: https://github.com/miner/tagged

venantius commented 9 years ago

Thanks, that's a very useful example. I'll see if I can get something workable for the 0.1.2 branch going here.

venantius commented 9 years ago

Okay, so, two things: first, I went down the rabbit hole and read your blog post about Hildebrand, and you're hilarious. Henceforth I shall read your blog for great delight.

Second: after poking around at this for a bit I think there might be a sneaky way to take note of the usage of reader literals, but I think I'm going to want to let the idea marinate a little further. The short version of what I'm thinking about looks like this:

If you check out https://github.com/venantius/yagni/blob/master/src/yagni/namespace/form.clj#L8, you'll see where we actually pull in the source code for a given interned var. Unfortunately, we immediately hand that source code over to read-string, at which point the Clojure reader kicks in and we lose any knowledge about whatever literals might have existed as the reader replaces them with the relevant output of the tagged literal.

What we could do instead is, given the existence of a data_readers.clj on the classpath, is to string search the relevant source code for the contained literals before handing it over to read-string.

The question then is where to keep track of the literals; in the main Yagni graph, or in some sort of separate counter. I'm tempted to keep them in the main graph but to use some special character marker within the graph to clearly delineate the nature of the reader (the # comes to mind as an obvious candidate).

Anyways, would love to hear your thoughts on this approach.

moea commented 9 years ago

(About the post: thanks - I'm glad you liked it!)

On losing track when you hand over to read-string, you could potentially do something nuts like this:

(def reader-invocations (atom {}))

(defn data-reader! [sym form]
  (swap! reader-invocations assoc sym true)
  form)

(binding [*data-readers*
          (into {}
            (for [[sym var] *data-readers*]
              [sym (partial data-reader! sym)]))]
  (read-string "#hildebrand/literal 1"))

Passing the state around could be a little awkward. Instead of some global thing, you could do something smarter/more local at binding time.

venantius commented 9 years ago

I haven't forgotten about this, just so you know. I spent a while thinking about how to do this with regular expressions before realizing there was a far easier way of doing it with simpler string handling. Hope to start implementing the intended approach this week.

venantius commented 9 years ago

Recent work I've done in Glow should open up the path for general form parsing, including (in particular) reader literals. Hadn't really occurred to me until today that the two projects would end up intersecting like this.