JuliaLogging / TensorBoardLogger.jl

Easy peasy logging to TensorBoard with Julia
MIT License

Load logged value back into Julia? #38

Closed · jessebett closed this issue 4 years ago

jessebett commented 5 years ago

Is there a convenient way to load the logged values?

For instance, I've logged a bunch of scalars in the usual way

using TensorBoardLogger, Logging
using Flux  # for Flux.data; loss() is my model's loss function

tb_logger = TBLogger("path/to/logfile")

function cb()
    with_logger(tb_logger) do
        @info "loss" loss = Flux.data(loss())  # log the scalar under an explicit key
    end
end

and now I want to plot/analyze those values back inside Julia. What's the recommended way to do this?

PhilipVinc commented 5 years ago

At the moment this is not supported.

It's true that it would be a nice feature.

jessebett commented 5 years ago

Hm, okay, in the meantime I will also manually log these quantities into something like a BSON file.

The export section of the TensorBoard GitHub docs points to tf.train.summary_iterator for exporting information from TensorBoard: https://www.tensorflow.org/api_docs/python/tf/train/summary_iterator

oxinabox commented 5 years ago

@jessebett you can probably use ProtoBuf.jl to load things out of the log's event.pb files, using the protocol buffer definitions included in TensorBoardLogger.jl.

But better and easier, I think, would be to just export them from TensorBoard via the web interface, which lets you export to CSV or JSON; see https://stackoverflow.com/a/42358524/179081

I used it once when a reviewer requested that I add plots for the training loss to my paper, and I didn't want to rerun experiments. Of course that is sub-ideal if you want to export a ton of them.
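
If you go the CSV route, reading the file back into Julia is a one-liner. A minimal sketch, assuming CSV.jl and DataFrames.jl and a hypothetical file name (the web UI's scalar export typically has Wall time, Step and Value columns):

using CSV, DataFrames

# "run_loss.csv" stands in for whatever file the web UI downloaded
df = CSV.read("run_loss.csv", DataFrame)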

If you do want to log them into BSON, writing a BSON logging sink wouldn't be too hard. (Though I think BSON doesn't naturally support appending, so maybe Feather would be better?) Such a logging sink would be good to have in LoggingExtras.jl, and then you would just use a DemuxLogger(TBLogger("logs_tb"), BSONLogger("logs_bson"))
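
For the curious, a minimal sketch of what such a sink could look like. BSONLogger is a hypothetical name, rewriting the whole file on every message is deliberately naive, and newer LoggingExtras versions call DemuxLogger TeeLogger:

using Logging, BSON

# Hypothetical sink: collect logged key/value pairs in memory and rewrite a
# BSON file on every message (simple and crash-tolerant, but not fast)
struct BSONLogger <: AbstractLogger
    path::String
    records::Vector{Dict{Symbol,Any}}
end
BSONLogger(path) = BSONLogger(path, Dict{Symbol,Any}[])

Logging.min_enabled_level(::BSONLogger) = Logging.Info
Logging.shouldlog(::BSONLogger, args...) = true
Logging.catch_exceptions(::BSONLogger) = false

function Logging.handle_message(lg::BSONLogger, level, message, _module,
                                group, id, file, line; kwargs...)
    push!(lg.records, Dict{Symbol,Any}(:message => message, kwargs...))
    BSON.bson(lg.path; records = lg.records)
end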

PhilipVinc commented 5 years ago

Just for reference, reading back event files is quite easy:

using TensorBoardLogger

function read_event(f::IOStream)
    # each record is: an 8-byte length header, a 4-byte CRC of the header,
    # the payload itself, then a 4-byte CRC of the payload
    header = read(f, 8)
    crc_header = read(f, 4)

    # check the header CRC
    crc_header_ck = reinterpret(UInt8, UInt32[TensorBoardLogger.masked_crc32c(header)])
    @assert crc_header == crc_header_ck

    # read the payload
    data_len = first(reinterpret(Int64, header))
    data = read(f, data_len)
    crc_data = read(f, 4)

    # check the payload CRC
    crc_data_ck = reinterpret(UInt8, UInt32[TensorBoardLogger.masked_crc32c(data)])
    @assert crc_data == crc_data_ck

    # decode the payload into an Event protobuffer
    pb = PipeBuffer(data)
    ev = TensorBoardLogger.readproto(pb, TensorBoardLogger.Event())
    return ev
end

fname="youreventfile"
f=open(fname, "r")

ev1 = read_event(f)
ev2 = read_event(f)
...

The slightly more complicated part would be destructuring the events to extract the summaries. In principle we could add a method that dumps everything into a ValueHistories object...
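
A hedged sketch of that destructuring step for scalars, building on read_event above (field names follow the TensorBoard protobuf definitions bundled with the package; how you detect an unset field depends on your ProtoBuf.jl version):

function read_scalars(fname)
    scalars = NamedTuple[]
    open(fname, "r") do f
        while !eof(f)
            ev = read_event(f)
            # `summary` is a message (non-bits) field, so isdefined works here
            isdefined(ev, :summary) || continue
            for v in ev.summary.value
                # caveat: simple_value is a bits field and always "defined";
                # a robust version would use ProtoBuf's isfilled to skip
                # non-scalar summaries
                push!(scalars, (tag = v.tag, step = ev.step, value = v.simple_value))
            end
        end
    end
    return scalars
end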

jessebett commented 5 years ago

@oxinabox I saw that the TensorBoard web interface had the download option, but that is definitely too inconvenient for what I'm trying to do. Is there an additional Logger I can run that stores the values in a convenient-to-read format for this sort of use? At the moment there's nothing fancy going into these TensorBoard logs, just scalars. Though it would be nice if this could accommodate model checkpointing too. I'm wary of pushing to a custom array and saving that to a BSON during training, because that is not very robust to cases where the server crashes.

@PhilipVinc I haven't taken a look at the inner workings of this. Is it interfacing the Python API with PyCall? If so, could it expose the summary_iterator I posted above? Additionally, the option to download CSV or JSON in the web interface is presumably exposed in the API somewhere; maybe it is possible to access whatever that web interface is doing?

PhilipVinc commented 5 years ago

No, I avoid working with PyCall. I created TensorBoardLogger exactly to avoid installing PyCall on the cluster where I run simulations. It is completely independent of Python and TensorBoard; as such, you can't use it to call into the Python API.

I'm not familiar with summary_iterator, but the read_event(f::IOStream) function above simply takes the event file (as an IOStream) and recreates the protobuffer from it. You can have a look at this gist, which does exactly what you want.

The limits of that gist are that (i) (I believe) it won't work if you append to an existing log and (ii) it works only with scalars. Addressing those limits would be easy if there were a decent way to check whether a field in a struct is undefined, but the macro @isdefined does not work (@oxinabox, maybe you know something about it?)

oxinabox commented 5 years ago

isassigned, maybe? Or the function isdefined.
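
For reference, @isdefined only checks variables; for a possibly-unassigned field of a mutable struct you want the two-argument isdefined (with the caveat that fields of bits types such as Float64 always count as defined, which is exactly the protobuf pain point above):

# Holder is a made-up type for illustration
mutable struct Holder
    x::Vector{Int}
    Holder() = new()   # leave `x` unassigned
end

h = Holder()
isdefined(h, :x)       # false
h.x = [1, 2]
isdefined(h, :x)       # true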

tkf commented 4 years ago

FYI, I started implementing a loader based on read_event: https://github.com/tkf/TensorBoardLoader.jl

@PhilipVinc Is it OK to redistribute your code snippet like this?

Something like TensorBoardLoader.loadscalars(DataFrame, "logdir") seems to work (where DataFrame can be replaced with any table-receiving function). Loading non-scalars shouldn't be hard but I haven't implemented high-level APIs yet.

oxinabox commented 4 years ago

Something like TensorBoardLoader.loadscalars(DataFrame, "logdir") seems to work (where DataFrame can be replaced with any table-receiving function).

Why not make loadscalars return a Table directly, so one can just use the constructor of DataFrame (or invoke any table-receiving function): DataFrame(TensorBoardLoader.loadscalars("logdir"))

Also why a separate package, why not make a PR to this one?

tkf commented 4 years ago

Why not make loadscalars return a Table directly

That's a good question. This is because Julia does not have a resource-cleanup API for iterators. So, if you want to support lazy loading and close the files reliably, this has to use the callback interface (which is not so bad, as you can use a do block). I'd guess many people don't care about it and are OK with letting the GC handle file close. It's easy to add such an interface; I just don't need it personally ATM.
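
The pattern in a nutshell, with hypothetical names (withscalars and each_scalar are illustrative, not TensorBoardLoader's actual API): the loader owns the file handle, so it can guarantee a close when the user's function returns, just like Base's open(f, path) do-block form:

function withscalars(f, events_file)
    open(events_file, "r") do io
        f(each_scalar(io))   # hand the callback a lazy iterator
    end                      # io is closed here, even if f throws
end

# usage: materialize inside the do block, e.g.
# withscalars("events.out.tfevents") do rows
#     DataFrame(rows)
# end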

Also why a separate package

It's heavily using my package Transducers.jl, so I wasn't sure about selling it with a PR, considering that iterator transforms are the major interface for this kind of task. I also haven't used it enough to see if the lazy-loader approach makes sense.

oxinabox commented 4 years ago

That's a good question. This is because Julia does not have a resource-cleanup API for iterators. So, if you want to support lazy loading and close the files reliably, this has to use the callback interface (which is not so bad, as you can use a do block). I'd guess many people don't care about it and are OK with letting the GC handle file close. It's easy to add such an interface; I just don't need it personally ATM.

Fair. One option might be to default the type to either Tables.RowTable (i.e. a Vector of NamedTuples) or Tables.ColumnTable (i.e. a NamedTuple of Vectors). So loading would be done eagerly, and cleanup could happen right away?
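
For concreteness (a generic illustration, not TensorBoardLoader's actual output): a Tables.RowTable is nothing more than a Vector of NamedTuples, so the eager default composes with any Tables.jl sink:

rowtable = [(tag = "loss", step = 1, value = 0.9),
            (tag = "loss", step = 2, value = 0.7)]

# any table-receiving function accepts it, e.g.
# using DataFrames; DataFrame(rowtable)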

PhilipVinc commented 4 years ago

@PhilipVinc Is it OK to redistribute your code snippet like this?

Yes of course

Also why a separate package, why not make a PR to this one?

Eventually I will add support for deserializing TB data to TBLogger (I already have a basis for this in the read-serialized branch, but I have not had time to finish it yet). Eventually users will default to what is provided in the base package, I believe, unless your optional package provides something that really is worth the extra hops.

If you need this now and with your constraints, I understand, but otherwise does it make sense to implement this twice?

-- Having said that, I have the following questions:

So, if you want to support lazy loading and close the files reliably.

Is there any issue with not closing the files reliably and letting the GC do the work? From the little I know, computers can live with a couple (thousand) open files. And as we are opening them in read-only mode, is it that important to close them?

support lazy loading

Is this really needed for the case of TensorBoard scalar data? To my understanding you lazy-load things that take a lot of memory (or time) to load. Is this the case, especially for scalar data?

oxinabox commented 4 years ago

Is there any issue with not closing the files reliably,

I would need someone to confirm, but I think it might cause issues on Windows; I recall something about Windows making open files exclusive. And you might want to be reading and writing? Not sure, though.

PhilipVinc commented 4 years ago

And you might want to be reading and writing?

On Unix platforms I'm confident this is a non-issue, because I used to do it often in C++ (open an fstream on a file and write from there while reading it from another process). On Windows, no idea. Please let me know.

tkf commented 4 years ago

Eventually users will default to what is provided in the base package, I believe, unless your optional package provides something that really is worth the extra hops.

@PhilipVinc I just needed this right now and so I hacked it up. My intention was a friendly "hey I implemented this, you can use this if you need it now" and not causing any conflicts or confusion.

From the little I know, computers can live with a couple (thousand) open files.

Quick googling showed me

https://discourse.julialang.org/t/massive-readstring-too-many-open-files/3426 (Linux)
https://superuser.com/questions/1356320/what-is-the-number-of-open-files-limits (Windows)

Is this really needed for the case of TensorBoard scalar data?

Maybe? I actually don't care, because transducers are lazy and efficient by default. There is no trade-off.

One option might be to default the type to either Tables.RowTable (i.e. Vector of NamedTuples), or Tables.ColumnTable (i.e. NamedTuple of Vectors).

@oxinabox That sounds like a good API. I need good table integration for Transducers.jl, and it would be nice to have some generic code for materializing a table (especially a column-oriented one) with the push-or-widen approach. It'll be a nice case study.
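
The push-or-widen idiom in miniature, as a sketch rather than Transducers' actual implementation: push into the current vector if the element fits its eltype, otherwise copy into a widened vector (the caller rebinds the result):

function pushwiden(v::Vector{T}, x) where {T}
    x isa T && return push!(v, x)
    w = convert(Vector{promote_type(T, typeof(x))}, v)
    return push!(w, x)
end

v = pushwiden(Int[], 1)    # Vector{Int}
v = pushwiden(v, 2.5)      # widened to Vector{Float64}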

PhilipVinc commented 4 years ago

Don't worry! I wasn't complaining at all. I just dislike wasting effort. Ignore me.

Quick googling showed me

Yes, you are right. But with TB you will have a bunch of files, hardly even hundreds... I'll still have a look at Transducers.

tkf commented 4 years ago

I just dislike wasting effort.

I guess that's a typical instinct of good programmers. Sorry to set your nerves on edge.

with TB you will have a bunch of files

Maybe it can happen if someone builds a machine-learning web service or something using the TB file loader? Also, closing an IO can trigger other events, like the writer stopping if it were a pipe. It makes sense to prefer a deterministic close. It's not hard to write code with a reliable close, so I think it's a good idea to just do it always.

PhilipVinc commented 4 years ago

Implemented in #57, closing.