JuliaLogging / TensorBoardLogger.jl

Easy peasy logging to TensorBoard with Julia
MIT License

List of Tensorboard Plugins and Julia Types #10

Open PhilipVinc opened 5 years ago

PhilipVinc commented 5 years ago

This is a list of all Tensorboard plugins and types that they (potentially) can accept. I want to see the overlap among different types.

Extra: (This is something I need for myself and is not currently supported by TensorBoard.) Eventually I would like to contribute a plugin to TensorBoard that shows a whole plot/curve at each iteration (very similar to what the PR curve plugin does). The TensorBoard dev team is also considering this, but apparently they don't have the time to work on it ATM.

oxinabox commented 5 years ago

Text mode

Text mode seems really strong, because it will render markdown (docs, python demo). And markdown is approximately a superset of HTML, since you can embed HTML within it. It should be possible to use repr(MIME"text/markdown"(), datum), falling back to MIME"text/html" and then MIME"text/plain", in order to display basically anything as a last resort. Probably add some more structure to that than just dumping the text representation, e.g. the time and step and such.
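A minimal sketch of that fallback chain, using only Base's showable and repr. The helper name richest_text and the Note type are made up for illustration and are not part of TensorBoardLogger:

```julia
# Hypothetical helper: return the richest textual representation `datum` supports,
# trying markdown first, then HTML, then plain text.
function richest_text(datum)
    for mime in (MIME"text/markdown"(), MIME"text/html"(), MIME"text/plain"())
        if showable(mime, datum)
            return string(mime), repr(mime, datum)
        end
    end
    # Not expected to be reached: every value is showable as text/plain.
    return "text/plain", string(datum)
end

# Illustrative type that knows how to render itself as markdown.
struct Note
    text::String
end
Base.show(io::IO, ::MIME"text/markdown", n::Note) = print(io, "**", n.text, "**")

richest_text(Note("loss exploded"))  # picks the markdown method
richest_text(42)                     # falls through to text/plain
```

The returned MIME string could be used to decide whether the text needs wrapping (for example in a code block) before it is handed to the text plugin.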

oxinabox commented 5 years ago

I do think that supporting everything in the logger via a function-based API, and wrapping and dispatching that API in the logging API, are two separate steps of the task. I.e. one can wrap all the plugins first and worry separately about deciding which to use when you @info them.

c42f commented 5 years ago

Btw, regarding text mode - I want to render textual logs as markdown-formatted text by default at some stage (in Logging.ConsoleLogger), so that's an extremely helpful "coincidence". (The only reason it hasn't been done is that stdlib Markdown was a bit awkward to use when adding extra decorations to the left of the markdown text last time I tried.)

PhilipVinc commented 5 years ago

@c42f, @oxinabox A reflection regarding using Wrapper types to dispatch to the right back-end, as we had briefly discussed in #9

For example, just present normal 2D arrays as images; if you want a histogram as presentation I think this implies you're interpreting the array as samples from some distribution, and you should wrap the array appropriately in a Samples wrapper. (Choose a better name of course ;-) ).

One potential issue I foresee with wrapper types is that when you compose loggers (à la DemuxLogger) or switch from Tensorboard to anything else that is not aware of those wrapper types, you start to introduce unnecessary noise.

For example, I recently started composing two loggers, Tensorboard and a very basic dump-to-MVHistory logger, so that I can monitor my simulations in TensorBoard and keep a copy of all the data in a Julia-friendly format. It's annoying to see wrapper types in the MVHistory.

oxinabox commented 5 years ago

hmm yes, that is annoying. That is a very good point.

The alternative is magic names. So for, say, logging a Vector to the text logger: @info x x_tensorboard=TB_Text. If the TensorBoard logger gets an argument called x and an argument called x_tensorboard, it would call the function specified by x_tensorboard on x before deciding which logger to use. So in the above it would call TB_Text(x).

This could also be used to allow other preprocessing that you only want to do for the TensorBoardLogger, e.g. you might want to do x_tensorboard = y->sum(y; dims=1)

Other loggers would just see the original x, as well as the extra magic argument. So they would still be able to handle x in their favoured way.
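A rough sketch of how a logger could resolve such magic names, assuming the key-value pairs arrive as a NamedTuple. The function name apply_magic_names and the suffix handling are illustrative assumptions, not existing TensorBoardLogger API:

```julia
# Hypothetical preprocessing step: for every key `k`, look for a companion key
# `k_tensorboard` holding a function, apply it to the value, and drop the hook.
function apply_magic_names(kwargs; suffix = "_tensorboard")
    out = Dict{Symbol,Any}()
    for (key, val) in pairs(kwargs)
        endswith(string(key), suffix) && continue       # hooks are not logged themselves
        hook = get(kwargs, Symbol(key, suffix), identity)
        out[key] = hook(val)                             # untouched when no hook is given
    end
    return out
end

x = [1 2; 3 4]
processed = apply_magic_names((x = x, x_tensorboard = y -> sum(y; dims = 1), z = 3))
```

Only a logger that calls this step sees the reduced value. Any other logger still receives both x and x_tensorboard unchanged.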

c42f commented 5 years ago

when you compose loggers or switch from Tensorboard to anything else that is not aware of those wrapper types, you start to introduce unnecessary noise.

Can you give a concrete example? I still hold to my point that wrappers are the appropriate way to add certain kinds of meaning to key value pairs (eg, "interpret this array as samples drawn from a distribution" => histogram formatting).

I do think the idea of adding formatting hints such as x_tensorboard into the log statements themselves will never work nicely with other backends and will just cause problems in the long term.


Let's step back a bit to the general problem. At a high level I think this is a classic case of the need to separate content (the log events) from presentation (the log sinks), but we haven't figured out how to hook the presentation onto the content yet with some kind of pattern matching.

Perhaps it would make sense to consider an analogy with html/css. Primary keys for pattern matching the content in that case are element_name,class,id (I think? I'm no expert whatsoever), plus nesting in the document hierarchy. To continue the analogy, at the content level maybe we have

Currently the log keys are generally used for variable names (or at least, that's how I use them) and there's very little in the way of guidelines for key naming. So one cannot expect them to be consistent between libraries in a way which would allow class-like pattern matching for formatting. But maybe they could or should be somehow?

shashikdm commented 5 years ago

I agree with @c42f. Wrappers are an elegant solution to the ambiguity problem. Since automatic dispatch is already in place, one can give raw data (without any wrapper) for simple datatypes such as String, Real, Array when using DemuxLogger, so both loggers can work smoothly. But it can't be helped when one has to use a wrapper (such as in the case of images) 🤔.

shashikdm commented 5 years ago

I would like to start working on the TBtext wrapper soon. The following are the datatypes that I believe TBtext should handle.

@PhilipVinc @oxinabox @c42f Please let me know if TBtext should handle any other datatype.

oxinabox commented 5 years ago

Please let me know if TBtext should handle any other datatype.

It should handle literally every datatype. It doesn't need to do any kind of handling. It just needs to control the dispatch. We already implemented all that in https://github.com/PhilipVinc/TensorBoardLogger.jl/blob/de969bdd31d5ace88b484a76e77bea5ac08c1b59/src/Loggers/LogText.jl#L12

PhilipVinc commented 5 years ago

Can you give a concrete example? I still hold to my point that wrappers are the appropriate way to add certain kinds of meaning to key value pairs (eg, "interpret this array as samples drawn from a distribution" => histogram formatting).

My current setup involves a DemuxLogger which forwards log messages both to TensorBoard and to an extremely simple logger (let's call it MVHistoryLogger) which pushes all incoming messages to an MVHistory.

If I use a wrapper type TBImage to log a 2D Matrix so that TensorBoard shows it as a 2D image

@info "" mymatrix=TBImage(my2ddata)

then in the MVHistory my matrix data will be stored wrapped inside the TBImage type. An additional complication is that if I serialise the MVHistory holding this data with JLD2, I will also need to load TensorBoardLogger to read it back, otherwise JLD2 might spit out errors because it does not recognise the type.

I do agree that using wrapper types is the most elegant solution when using only one logger. I just see some problems arising when mixing several loggers, unless all loggers are aware of this preprocess machinery and of the wrapper types (for example by splitting this logic out into a separate package).
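To make the problem concrete, here is a small self-contained sketch. TBImage below is a local stand-in for the wrapper, a plain Dict stands in for the MVHistory, and unwrap is a hypothetical function of the kind a shared package could provide:

```julia
# Local stand-in for a TensorBoard-specific wrapper type.
struct TBImage{T}
    data::T
end

# Hypothetical shared hook: loggers that do not care about presentation
# strip the wrapper before storing the value.
unwrap(x) = x
unwrap(x::TBImage) = x.data

my2ddata = [0.1 0.2; 0.3 0.4]

naive_history = Dict{Symbol,Any}()
naive_history[:mymatrix] = TBImage(my2ddata)          # the wrapper leaks into the stored data

aware_history = Dict{Symbol,Any}()
aware_history[:mymatrix] = unwrap(TBImage(my2ddata))  # the plain matrix is stored
```

The second history no longer depends on the wrapper type, which is the situation where serialising it would not require loading the package that defines TBImage.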

shashikdm commented 5 years ago

I suggest that instead of making a struct and using it as a wrapper, we create a function which returns the data along with some metadata that tells TBLogger which logger to use, e.g.

function TBimage(data::Array)
    # Tag telling TBLogger which backend should handle `data`
    metadata = "log_image"
    return (data, metadata)
end

Then in the preprocess function we can check whether metadata exists and, if so, use that logger; otherwise we fall back to automatic dispatch.

shashikdm commented 5 years ago

Downside is that metadata will also appear in the other logger.

c42f commented 5 years ago

If I use a wrapper type TBImage to log a 2D Matrix so that TensorBoard shows it as a 2D image

Thanks, it's great to have a concrete example. Certainly it's less than ideal to have the wrappers be TensorBoard-specific types because other backends then need to depend on TensorBoard for correct formatting.

Of course we could have some selected wrappers in stdlib Logging, but that wouldn't generalize either. @shashikdm's suggestion of adding metadata makes sense. We'd need a type for this which is more specific than a plain Tuple — you don't want to confuse 2-element Tuple (value1,value2) with (value,metadata).
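As an illustration of the "more specific than a plain Tuple" point, a dedicated carrier type could look roughly like this. WithMetadata, backend_for and payload are hypothetical names chosen for the sketch:

```julia
# Hypothetical carrier type: a value plus a formatting hint for the backend.
struct WithMetadata{T,M}
    value::T
    metadata::M
end

# Anything that is not explicitly tagged goes through automatic dispatch.
backend_for(x) = :auto
backend_for(x::WithMetadata) = x.metadata

payload(x) = x
payload(x::WithMetadata) = x.value

tagged = WithMetadata([0.1 0.2; 0.3 0.4], :log_image)
plain_tuple = ([0.1 0.2; 0.3 0.4], :log_image)   # an ordinary 2-element Tuple
```

Because dispatch is on the type, an ordinary Tuple that happens to have two elements is never mistaken for a (value, metadata) pair.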

Having said that, I think a more idiomatic alternative would be to put some "log key-value matching" functions into a central location (perhaps LoggingExtras for now, with the view to eventually moving it into stdlib Logging). The existing multimedia display system seems very related, ie, display(d::AbstractDisplay, x) and show(io, mime, x) etc. It seems like we should somehow hook into those or extend them for this purpose.

Relevant prior discussion https://github.com/JuliaLang/julia/pull/29397 See also https://github.com/JuliaLang/julia/pull/27430

PhilipVinc commented 5 years ago

I agree with the fact that the multimedia display system is related. For example, if all wrapper types were subtypes of some abstract type WrapperLogType end, we could fix the formatting of wrapped types by defining

show(io, mime, x::WrapperLogType) = show(io, mime, x.data) 

We could then define a new MIME type "TensorBoard" and put logic relevant for us there.
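A small sketch of that forwarding idea, with Samples as an illustrative wrapper. The mime argument is restricted to text/plain here on the assumption that leaving it untyped would be ambiguous with Base's generic text/plain show method:

```julia
abstract type WrapperLogType end

# Illustrative wrapper: "interpret this array as samples from a distribution".
struct Samples{T} <: WrapperLogType
    data::T
end

# Backends that only know about plain text see the wrapped data, not the wrapper.
Base.show(io::IO, mime::MIME"text/plain", x::WrapperLogType) = show(io, mime, x.data)

wrapped_text = repr(MIME"text/plain"(), Samples([1, 2, 3]))
raw_text = repr(MIME"text/plain"(), [1, 2, 3])
```

This only fixes how the wrapper is printed. A logger that stores the value itself would still hold the wrapper.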

But

c42f commented 5 years ago

Agreed, show and display aren't general enough even though there's a tantalizing connection.

Possibly more relevant is Jameson's comment here: https://github.com/JuliaLang/julia/pull/29397#issuecomment-440116646? I didn't get around to looking at that yet.

xukai92 commented 4 years ago

tensorboardX supports logging a matplotlib object directly. How hard is it to implement this feature here?

oxinabox commented 4 years ago

Not too hard really, use Plots.savefig then hit up the stuff for displaying images.
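One possible way to avoid a temporary file, sketched here as an alternative to savefig rather than what was proposed above: render the object to PNG bytes through Base's show machinery. This assumes the plot type defines an image/png show method (Plots.jl plots are assumed to, but that is not checked here). png_bytes and FakePlot are made-up names:

```julia
# Hypothetical helper: render any object that supports image/png into raw bytes.
function png_bytes(obj)
    mime = MIME"image/png"()
    showable(mime, obj) || throw(ArgumentError("object cannot be rendered as image/png"))
    io = IOBuffer()
    show(io, mime, obj)
    return take!(io)
end

# Stand-in for a plot object so the sketch runs without a plotting package;
# it only writes the first four bytes of the PNG signature.
struct FakePlot end
Base.show(io::IO, ::MIME"image/png", ::FakePlot) = write(io, UInt8[0x89, 0x50, 0x4e, 0x47])

bytes = png_bytes(FakePlot())
```

The resulting bytes would still have to be passed to whatever image-logging path the package exposes. That step is left out because it depends on the package internals.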

xukai92 commented 4 years ago

Cool. I will give a try.

oxinabox commented 4 years ago

any Plots dep should be hidden behind Requires.jl

xukai92 commented 4 years ago

BTW I found the corresponding helper func in tensorboardX: https://github.com/lanpa/tensorboardX/blob/master/tensorboardX/utils.py#L2

PhilipVinc commented 4 years ago

@oxinabox We haven't yet implemented Requires in TensorBoardLogger. This makes me think that we could hide behind Requires.jl a default dispatch for Plots.Plot objects that calls Plots.savefig on them.

If I have a minute I'll do this with #39

oxinabox commented 4 years ago

@oxinabox We haven't yet implemented Requires in TensorBoardLogger.

Huh so we don't. I think we should use Requires aggressively in this package.

I know I was hesitant before, but now I think we should add lots of deps and visualize all of them by using Requires.jl

xukai92 commented 4 years ago

The new TFBoard also supports a pane called hyperparameters (https://www.tensorflow.org/tensorboard/r2/hyperparameter_tuning_with_hparams). How should one add this?

PhilipVinc commented 4 years ago

Interesting... Adding this to TBL.jl should be quite easy:

  1. Go to the TensorBoard repository and find the .proto file for the hyperparameters plugin;
  2. Compile the .proto to Julia files with ProtoBuf.jl, and include them in the package;
  3. We should decide on a type used internally (and probably also exposed as API) to signal that this data should be serialised as hyperparameters. Something similar to TBImage or TBAudio. Let's call it TBHyperParams for now;
  4. Write a function hyperparams_summary(name::String, data::TBHyperParams) that serialises the data to the correct protobuffer. You can have a look at text_summary, which implements something similar;
  5. Specify how TBHyperParams should be managed by the dispatch machinery by declaring the two functions
    preprocess(name, val::TBHyperParams, data) = push!(data, name=>val)
    summary_impl(name, val::TBHyperParams) = hyperparams_summary(name, val)

Most of the work will be figuring out how to do [4]. As the documentation is very scarce, to do it you will have to go through TensorBoard's or tensorboardX's source code.
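Steps 3 to 5 can be sketched in isolation as below. Everything here is a standalone mock-up: the functions are defined locally rather than extending TensorBoardLogger's own, the fields of TBHyperParams are a guess, and hyperparams_summary returns a NamedTuple placeholder instead of a real protobuf message (that is the part step 4 would have to fill in):

```julia
# Step 3: hypothetical marker type for hyperparameter data.
struct TBHyperParams
    params::Dict{String,Any}
end

# Step 4 (placeholder only): a real version would build the hparams protobuf message.
hyperparams_summary(name::String, data::TBHyperParams) = (tag = name, hparams = data.params)

# Step 5: hook the type into the dispatch machinery (local mock of the two functions).
preprocess(name, val::TBHyperParams, data) = push!(data, name => val)
summary_impl(name, val::TBHyperParams) = hyperparams_summary(name, val)

# Mock of the logger's flow: preprocess collects name => value pairs,
# then each pair is turned into a summary.
data = Vector{Pair{String,Any}}()
hp = TBHyperParams(Dict{String,Any}("lr" => 0.01, "optimizer" => "adam"))
preprocess("run1/hparams", hp, data)
summaries = [summary_impl(name, val) for (name, val) in data]
```

The mock shows only how the pieces connect. The serialisation format itself still has to come from the plugin's .proto files.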

If you'd like to give this a try I'd be happy to guide you.

xukai92 commented 4 years ago

I'm very new to protocol buffers. For 1., I found that the folder for hyperparameters is https://github.com/tensorflow/tensorboard/tree/master/tensorboard/plugins/hparams. Which files should I try to convert?

And how does one use ProtoBuf.jl? I tried run(ProtoBuf.protoc(`--julia_out=jlout tensorboard/plugins/hparams/api.proto`)) (supposing api.proto is what I want to convert), but this gives me errors starting with "Plugin output is unparseable".

PhilipVinc commented 4 years ago

Hey @xukai92, I'm sorry I never answered, but I was overwhelmed in October/November by my PhD defense. If you want to get back to this, let me know.

Just for reference, for anyone who attempts this in the future: you should take the .proto files in that folder and compile them with Protobuffer. They probably depend on the proto files of the main TensorBoard package, so you should pass that folder too, but I should look into it again...

c42f commented 4 years ago

Relevant crossref to this discussion of dispatch in logging messages is the Progress type @tkf just introduced in ProgressLogging, and which will also be supported in TerminalLogger: