brandonbloom / fipp

Fast Idiomatic Pretty Printer for Clojure
525 stars 44 forks source link

Add a version of fipp that prints to string #55

Closed bbatsov closed 5 years ago

bbatsov commented 5 years ago

It'd be really useful if there was one such function that people can directly use as the print function of nREPL or its clients. Currently cider-nrepl has to have a dependency on fipp just to tweak it's api a bit. See https://github.com/clojure-emacs/cider-nrepl/blob/master/src/cider/nrepl/pprint.clj#L29

nREPL's print contract is here https://nrepl.org/nrepl/design/middleware.html#_pretty_printing Basically a function that takes an object and map of print options and returns a string.

zprint, puget and pprint (to some degree) already have such functions, so it'd be nice if fipp had one as well.

P.S. You can ignore the part about keywordizing of keys. When we've added pprint support to nREPL I didn't remember that keywords can't be encoded with bencode.

brandonbloom commented 5 years ago

I'd prefer not to add such a function because I don't want to encourage folks to pretty-print to a string. It's occasionally useful to do pr-str for data in messages when you know you have a number, keyword, small vector of those things, etc. However, it's almost always preferable to have streaming primitives for multi-line text, especially that of potentially unbounded size!

I'm sympathetic to both sides of the discussion in this thread about socket repls. Although I still don't think raw pprint-str is a good idea even for the IDE via RPC use-case.

Instead, allow me to make a feature request of you for nREPL. As nREPL exists today, it's trivial to hang it with massive or even infinite output, gobbling up tons of memory, and ultimately requiring me to kill the process. This drives me absolutely nuts as a consumer of nREPL via vim-fireplace. It doesn't have to be this way.

Instead of a function that prints to a string, nREPL should accept two configuration parameters: A streaming print function and the maximum response size. During result printing, output is redirected to a buffer bounded by that size. When writes exceed the capacity of the buffer, the offending write fails with a recoverable error, such that a truncated response can be displayed to users, ideally with a warning.

This could be retrofitted in to nREPL in a backwards compatible way. Assuming a macro with-out-head that is like with-out-str, but performs truncation, nREPL would simply do:

(with-out-head n
  (print (str (printer result)))

This way, either/both stdout and returned strings are included in the resulting response.

As a short-term fix to your specific dependency problem, try requiring inline as in this code:

(defn fipp-pprint
  ([object]
   (fipp-pprint object {}))
  ([object opts]
   (require 'fipp.edn)
   (with-out-str
     ((resolve 'fipp.edn/pprint) object opts))))

I'm open to further discussion about improving nREPL here, or if this solves your problem, feel free to close this issue.

bbatsov commented 5 years ago

Instead, allow me to make a feature request of you for nREPL. As nREPL exists today, it's trivial to hang it with massive or even infinite output, gobbling up tons of memory, and ultimately requiring me to kill the process. This drives me absolutely nuts as a consumer of nREPL via vim-fireplace. It doesn't have to be this way.

I see your point, but nREPL streams values/output/errors even now, albeit most clients can't handle this well, so that's not something that can be solved simply. Also keep in mind that with the bencode transport we're forced to print the results, as bencode can represent only for data types (string, map, list and integer). And, of course, there's also the problem of using some non-streaming printer that's not going to end well. :-)

Instead of a function that prints to a string, nREPL should accept two configuration parameters: A streaming print function and the maximum response size. During result printing, output is redirected to a buffer bounded by that size. When writes exceed the capacity of the buffer, the offending write fails with a recoverable error, such that a truncated response can be displayed to users, ideally with a warning.

How would this differ from passing some max length to the printer function? At some point we discussed adding a cutoff limit to the value/output messages but that doesn't seem very practical, as because the values and output are streamed anyways clients can decide when to cut off the transfer if they wish to.

These days we've got some crazier ideas like https://github.com/nrepl/nrepl/pull/106, but I'm afraid it's unlikely that vim-fireplace would ever make use of this, as I doubt there are edn parsers for vim.

As a short-term fix to your specific dependency problem, try requiring inline as in this code:

That's pretty much what I'm doing in cider-nrepl today, but I don't want to have to depend on fipp just to expose one function that the end users can pass to nREPL. If you recall recently because of this dep everyone was complaining about some warnings from a fipp dependency.

bbatsov commented 5 years ago

This reminded me of one of the old ticket from the early days of nREPL and CIDER https://github.com/clojure-emacs/cider/issues/421

I'm hardly an expert in printing data and the nuances here. I'll also ping here @greglook, as he's the one who came up with the current approach. Because of his work on Puget I assume he knows this stuff way better than me. :-)

brandonbloom commented 5 years ago

but nREPL streams values/output/errors even now

It doesn't matter if the protocol supports streaming the transmission if the server buffers the entire response string before writing to the stream.

How would this differ from passing some max length to the printer function?

My suggestion does not require cooperation from the printer function, so that it would apply universally to any printer. Old-school, unix-style process composition at work!

output are streamed anyways clients can decide when to cut off the transfer if they wish to.

I'm not sure I've properly communicated the problem. Maybe this will clear it up:

When printing (range) in to a string, the server will send 0 bytes to the client. That is, the printer will never terminate and so no streaming will be happening. Eventually, you'll run out of memory. In this context, large is effectively the same as infinite.

These days we've got some crazier ideas like nrepl/nrepl#106, but I'm afraid it's unlikely that vim-fireplace would ever make use of this, as I doubt there are edn parsers for vim.

Assuming I understand that ticket properly, you're talking about the server retaining the returned value and using the datify/nav protocols to explore it interactively. I think that's a great idea! But it's orthogonal to fixing self-inflicted denial-of-service attacks currently possible with the printing/bytes-oriented responses.

I don't want to have to depend on fipp

You don't have to add a dependency to your package. Fipp is very stable and I promise you that it will remain so. You can omit the dependency and ask users to depend on it themselves as part of their printer selection/configuration.

bbatsov commented 5 years ago

It doesn't matter if the protocol supports streaming the transmission if the server buffers the entire response string before writing to the stream.

Yep, I realized this after our initial message exchange.

You don't have to add a dependency to your package. Fipp is very stable and I promise you that it will remain so. You can omit the dependency and ask users to depend on it themselves as part of their printer selection/configuration.

Yeah, but they'll also have to do add the wrapper and this would make it hard for a client tool like CIDER to figure out what print options the printer understands (unfortunately the options map for zprint and fipp is different, and pprint supports a different options as well). Right now the options are inferred from the name of the print function, but down the road this would mean adding a print type and print function config options which might be a bit excessive, but yeah - that's obviously doable. I just want end users to have smooth experience and not have to deal with print wrappers, extra settings and so on.

cgrand commented 5 years ago

@bbatsov said:

I'm afraid it's unlikely that vim-fireplace would ever make use of this, as I doubt there are edn parsers for vim.

See https://github.com/kotarak/vimpire/blob/master/autoload/vimpire/edn.vim

bbatsov commented 5 years ago

Nice! I stand corrected then. :-)

brandonbloom commented 5 years ago

@bbatsov I don't follow most of your last paragraph.

Yeah, but they'll also have to do add the wrapper

Add what wrapper?

this would make it hard for a client tool like CIDER to figure out what print options the printer understands (unfortunately the options map for zprint and fipp is different, and pprint supports a different options as well).

What do print options have to do with printing to string?

Right now the options are inferred from the name of the print function, but down the road this would mean adding a print type and print function config options which might be a bit excessive, but yeah - that's obviously doable.

Maybe I should mention that I don't use Cider and I don't know anything about it. I don't know what it means to infer options in this context. What options? We're talking about streaming and maximum request payload lengths. Do you mean print-length, etc? Fipp supports some of Clojure's standard dynamic variables, or you can parameterize it. I'm not going to change my API to match other printers. If Clojure adds/changes their dynamic vars, which seems relatively unlikely, I may add support for those things. Beyond that, users are on their own to configure Fipp as they choose.

I just want end users to have smooth experience and not have to deal with print wrappers, extra settings and so on.

Then just pick a default printer, include it out of the box, and be done with it. Fipp is a fine choice and a tiny dependency, but I won't be offended if you fork it, choose another, write your own, whatever.

I'm going to close this ticket for now, since I'm quite convinced that there is nothing for Fipp to change here. Again, let me know if I'm totally missing something.

bbatsov commented 5 years ago

Add what wrapper?

The one you mentioned in your original comment. :-)

What do print options have to do with printing to string?

Well, you still care about width, print-length, etc.

Maybe I should mention that I don't use Cider and I don't know anything about it. I don't know what it means to infer options in this context. What options? We're talking about streaming and maximum request payload lengths. Do you mean print-length, etc? Fipp supports some of Clojure's standard dynamic variables, or you can parameterize it. I'm not going to change my API to match other printers. If Clojure adds/changes their dynamic vars, which seems relatively unlikely, I may add support for those things. Beyond that, users are on their own to configure Fipp as they choose.

Yeah, I got that you don't understand what I mean. Maybe this will help - http://www.cider.mx/en/latest/pretty_printing/ Right now fipp is simply a config option, but that comes at the cost of adding it as a dep to CIDER's nREPL middleware simply to add the print to string function. If I don't know how the printer is named (which will be the case when I remove the print to string wrapper) it can't be a toggle anymore. And if you're generating the print options map dynamically you'll need two configs - one for the printer library, one for the printer function.

Then just pick a default printer, include it out of the box, and be done with it. Fipp is a fine choice and a tiny dependency, but I won't be offended if you fork it, choose another, write your own, whatever.

The default is pprint for obvious reasons, but the other options are there for the user's convenience. I don't even use pretty-printing so it's all the same to me.

I'm going to close this ticket for now, since I'm quite convinced that there is nothing for Fipp to change here. Again, let me know if I'm totally missing something.

Fair enough. I understand your perspective and I'll solve this is some other way then.

brandonbloom commented 5 years ago

that comes at the cost of adding it as a dep to CIDER's nREPL middleware simply to add the print to string function

  1. Using a dynamic require and indirecting through a symbol means you don't need to add an explicit dependency.
  2. Fipp is small and stable. It's pretty harmless to just add the dependency. That's what I'd do.
bbatsov commented 5 years ago

It's 2. right now. :-) I opened this ticket hoping to reduce one dep. Anyways, not a big deal.