dfdx / Boltzmann.jl

Restricted Boltzmann Machines in Julia

Implicit conversion of data types #25

Closed - dfdx closed this issue 8 years ago

dfdx commented 8 years ago

From #19:

It would be nice if the precision was a bit more flexible. I might want a Float32 RBM, but it isn't maintainable for me to update all of my code to convert Float64 context variables and input data to Float32.

It sounds like a good idea, especially in the context of fitting large datasets, where a user may not be able to convert an entire dataset in memory but converting it on a per-batch basis is still possible. However, I have some concerns regarding this idea.

Does it make sense, or am I just paranoid?

cc: @Rory-Finnegan

rofinn commented 8 years ago

Addressing each concern in order.

  1. I agree that explicit is better than implicit, but given the verbosity of converting everything before and after, I imagine there is a middle ground. Given that we explicitly state the precision of the RBM at creation time, I don't think sticking to that precision for all calculations is an unreasonable assumption from the user's perspective. That being said, I think it would be reasonable to convert all input to the RBM's precision and then convert it back afterwards if necessary (see the sketch after this list).
  2. I think at the very least all exported functions should support implicit conversion.
  3. How would your example result in a performance loss? Wouldn't you be doing a check before the conversion? In the example given, you'd do the conversion on fit, and sample wouldn't need to convert anything because the input precision at that point would match the RBM's. Now obviously, if you wanted to call sample directly with a different precision, then you would need to do a conversion.
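
A minimal sketch of what that convert-at-the-boundary policy could look like, assuming a stripped-down `RBM{T}` type; the struct, `logistic`, and `transform` below are illustrative stand-ins rather than Boltzmann.jl's actual definitions:

```julia
# Illustrative only: a stripped-down RBM parameterized by element type T.
struct RBM{T<:AbstractFloat}
    W::Matrix{T}       # weights, hidden x visible
    vbias::Vector{T}   # visible bias
    hbias::Vector{T}   # hidden bias
end

logistic(x) = 1 ./ (1 .+ exp.(-x))

# Convert the input to the RBM's precision on entry, do all math in T,
# and (optionally) convert the result back to the caller's element type.
function transform(rbm::RBM{T}, X::AbstractMatrix{S}) where {T,S}
    Xc = convert(Matrix{T}, X)                  # already a no-op when X is a Matrix{T}
    H = logistic(rbm.W * Xc .+ rbm.hbias)       # hidden activations in precision T
    return S === T ? H : convert(Matrix{S}, H)  # convert back only if needed
end
```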

Does that make sense or am I missing something?

dfdx commented 8 years ago
  1. Sounds reasonable. I don't think we need to convert the output back to its original precision, though. Consider, for example, the transform() function in deep networks - what a mess it would be if we were converting data back and forth on each layer. I think if we state that, say, an RBM's precision is 32 bits, then the output should always be 32 bits, regardless of the input (note that in this case we would have already lost the actual precision). See the no-convert-back variant sketched after this list.
  2. That's the trick - in Julia there is no strict "public" and "private" namespace as in e.g. Java, so even if a function is not exported, it is still callable from outside.
  3. This way we couple fit and sample, assuming a particular implementation and workflow. If, however, the call to sample comes from some other place (e.g. fit(NewUnusualRBM)), this assumption will be broken and it's hard to predict the consequences.
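
For contrast, a sketch of that no-convert-back policy, reusing the stand-in `RBM` and `logistic` from the earlier sketch and replacing its `transform` method; the output always stays at the RBM's precision:

```julia
# Illustrative only: convert once on the way in, never convert back.
# An RBM{Float32} always returns Float32, regardless of the input eltype.
function transform(rbm::RBM{T}, X::AbstractMatrix) where {T}
    Xc = convert(Matrix{T}, X)                 # one conversion at the boundary
    return logistic(rbm.W * Xc .+ rbm.hbias)   # always Matrix{T}
end
```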

All in all, I'm just worried about the increased number of possible workflows. On the other hand, until we get to version 1.0 we are free to change the API, so I'm going to give it a try anyway and see how it goes.

rofinn commented 8 years ago
  1. I'm not as familiar with the DBN stuff, but I would assume that the DBN would also have a precision that should match its layers, so wouldn't it just convert any input once, and then the layers wouldn't need to do any conversion?
  2. Yes, I kind of like the idea of assuming all unexported items are private. I'm also partial to Python's approach of adding _ to the beginning of private methods and types to make it visually apparent what is "public" vs "private".
  3. Ah, okay. To clarify, your concern is that while fit(::RBM, ::Mat{T}...) calling sample_hiddens(::AbstractRBM, ::Mat{T}) wouldn't have any performance issues, it could if someone's fit(::NewUnusualRBM, ::Mat{T}) doesn't do the conversion and still calls sample_hiddens(::AbstractRBM, ::Mat{T})? That does seem like a problem, and I can't think of a good solution (see the sketch after this list).
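
A hypothetical illustration of that concern, again reusing the stand-in `RBM` and `logistic` from above; `sample_hiddens` here is a simplified stand-in, not the real implementation. If it converts its input itself, the conversion is free whenever `fit(::RBM{T}, ...)` has already converted the batch, but a `fit(::NewUnusualRBM, ...)` that skips the conversion silently pays for a copy on every call:

```julia
# Illustrative only: sample hidden units, converting the input to precision T.
function sample_hiddens(rbm::RBM{T}, X::AbstractMatrix) where {T}
    Xc = convert(Matrix{T}, X)              # free if the caller already converted
    p = logistic(rbm.W * Xc .+ rbm.hbias)   # hidden activation probabilities
    return T.(rand(T, size(p)) .< p)        # Bernoulli samples, kept at precision T
end

rbm = RBM{Float32}(randn(Float32, 10, 20), zeros(Float32, 20), zeros(Float32, 10))
X32 = rand(Float32, 20, 5)
sample_hiddens(rbm, X32)   # matching precision: the convert is a no-op
X64 = rand(20, 5)          # e.g. a fit method that forgot to convert its batch
sample_hiddens(rbm, X64)   # still works, but allocates a Float32 copy on every call
```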

I think generalizing the test cases and adding benchmarks might help with handling the different workflows. I'll open an issue for that and start working on it.

Thanks.

dfdx commented 8 years ago
  1. DBNs are simply stacks of RBMs, each of which is trained separately - check out the relevant section of nets.jl. It's okay to have a flow like data{Float64} -> RBM_1{Float32} -> data{Float32} -> RBM_2{Float32} -> ... (i.e. convert data to Float32 once and pass it through all RBMs), but I'd like to avoid flows like data{Float64} -> RBM_1{Float32} -> data{Float64} -> RBM_2{Float32} -> ... (i.e. convert between Float32 and Float64 between each pair of RBMs). See the sketch at the end of this comment.
  2. In Python, the _ prefix works as a "soft private" marker, i.e. from M import * does not import objects whose names start with an underscore. In Julia, the visibility of methods is controlled via explicit exports, so I don't see any advantage in using leading underscores in method names. At the same time, I'm not against it, so I don't mind using it for "strictly private" methods like _get_dataset.
  3. Exactly. I have some more examples in mind, but I'm not sure how realistic they are. Benchmarking sounds like a good idea - just take into account #30, which uncovers some restrictions on tests/benchmarks.
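
A sketch of the preferred flow, reusing the stand-ins above; `transform_dbn` is a hypothetical helper, not the actual nets.jl code. The data is converted to Float32 once at the top and then passed through the stack with no further conversion:

```julia
# Illustrative only: one Float64 -> Float32 conversion at the top of the stack,
# then each RBM{Float32} layer consumes and produces Float32 directly.
function transform_dbn(layers::Vector{RBM{Float32}}, X::AbstractMatrix)
    H = convert(Matrix{Float32}, X)   # data{Float64} -> data{Float32}, once
    for rbm in layers
        H = transform(rbm, H)         # stays Float32 across every layer
    end
    return H
end
```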
dfdx commented 8 years ago

Fixed by #33