Open mrsndmn opened 3 years ago
There are some disadvantages of current onlineconf-go interface:
Common
- panics I don't see any reason this library to panic. Possibly it would be acceptable to raise panic inside methods like
Must{Int,String,Reloader,Module...}
. But panics difenetly not appropriate insidenewModule
andinitWatcher
. Errors must be returned instead of panics.
Yes, panics are not desirable in most cases. Especially if there is a way to gracefully handle errors returned by a function. Do you see a way to handle errors in all hundreds calls to Get...()
spread around the code? Almost always a service is not usable without the configuration.
Reloader
I don't think it is a best practice to create goroutines inside any method of util library with no ability to handle errors. Possible solutions:
- return error chan to handle errors that happened inside initWatcher/module reloader
I don't think a user can do something with these errors except to log them.
- do not create goroutine. Creating goroutine is a responsibility of onlineconf-go's user. Personally I like this approach more.
The ability to create goroutines in a library through the only standard API is a strong point of the Go language. I don't see a reason to drop this advantage.
This approach let us drop log dependency.
Is it a problem? This is quite a common pattern. It is not a dependency, it is a possibility to redefine the logger.
Module
- global state/global variables
Initialize()
is a crutch. Any (almost all) global variable complicates testing and decreases flexibility of library. I guess it would be better to provide interface like this:
A simplified testing must not lead to a complicated usage. Agree what a global state is not a good solution. But passing a reloader through all libraries may be annoying.
NewModuleReloader(&ReloaderOptions{Name: moduleName, Dir: moduleDir}) // we also could provide simplified interface for default module dir: // Reloader returns reloader for specified module in default module dir func Reloader(moduleName string) (*ModuleReloader, error) { // MustReloader returns reloader for specified module in default module dir. Panics on error. func MustReloader(moduleName string) *ModuleReloader { module := relaoder.Module()
Almost always MustReloader
will be used.
Module should not be saved in global map.
I'd prefer not to mmap one file twice if reloader was created twice in two unlinked libraries.
Getters
- Getters (
module.Int(..)
,module.String(..)
) methods must return errors if there was one while reading from mmapped file or parsing parameter string (for ints).
What do you expect a user must do if an error happened? They must get a value. Right now they get a default value if a parameter not exists or an error happened.
- go native maps are faster and safer than reading even from mmapped cdb file. I guess it would be convenient to cache all parameters. todo reproduce benchmarks (i benched it too long ago and lost the snippet)
Reading from a mmaped file is a memcpy. Why do you think it is unsafe? Read locks are required in the both cases. I do not expect significant advantage of go map if any.
- each getter method must have default value Methods like
*IfExists
are redundant. Requiring the second argument to be passed will simplify interface of onlineconf-go.
Agree in most cases. The only case when they can be useful is if a default value must be calculated or fetched from some storage.
val, err := module.Int(path, 0) if isErrConfigParamNotFound(err) { // handle err }
Do you really want to write such checks every time?
Path validation
Invalid parameters paths sometimes can lead to hurtful mistakes.
Path validation would be more reasonable in compile time.
Golang has no compile time hooks. Of course a validation is always desirable but not by the cost of the readability of the code.
intPath := MustParamPath("/path/to/param/int") # panics on invalid parameter path ocInt, err := module.Int(intPath)
Sometimes it is also useful to specify onleneconf parameters path prefix and further join prefix with sub-config parameters. Considering all this I formulated properties of valid onlineconf paths:
- Onlineconf path can't end with
/
.- Onlineconf path must be prefixed with
/
.- Single slash
/
is a valid onlineconf path.- Double slash
//
is prohibited
From an application point of view OnlineConf parameter path is just a string. There are no ways to get siblings or iterate through children. Moreover depending of a module configuration the path can be in the two forms: unix-like or dot-separated.
Problem: nobody cares on handling errors in onlineconf getters and in PathValidations. I guess that the most frequent way to retrieve parameter value would be
module.MustInt(MustParamPath("/onlineconf/test"))
Yes. Totally agree. So is it reasonable to complicate?
Runtime panics are undesirable. So we can move parameters paths check to compile time with specifying default value right there.
Golang has no compile time checks. Anything will be in the runtime. Sooner or later.
var VeryImportantParam = MustParamPathInt("/onlineconf/test", 0) // path + default func main() { reloader := MustReloader("TREE") // config reloader form default config-dir module := reloader.Module() // consistent copy of module params val, err := module.Int(VeryImportantParam) // also with default if err != nil { // nobody would check this error but we have to pass it out because we don't know how to handle it // here could be useful errors e.g. can't parse param } ... }
Too hard to use isn't it?
further work
Not sure these features are reasonable right now or at all
The most safe way to work with onlineconf I imagine:
- Declarative parameters description
Yes. OnlineConf is schemaless now. I'd prefer to have a schema but probably it will be too complicated to implement and migrate.
Reloader reads and caches all declared parameters.
- if there are any error in config (parameter type is wrong or required parameter is missing) reloader goroutine would be able to handle this error and not to update current copy of module.
- there is also disadvantages of thus approach: application with broken config won't be able to restart
- sometimes you don't know the real path of parameters you want to get. There are 2 solutions: use json parameters for dynamic config keys or allow reloader to cache parameters by regexp/template etc
With this approach getter never return error because all the parameters are cached.
You really don't want to get random values from cache by different instances depending of their start time. Right now getters return a default value in case of errors. This is more stable.
Right now there are two scenarios when onlineconf-go panics:
First case can be solved by requiring to call Get...()
on already opened cdb. Just removing shortcuts for "TREE"
will be enough.
The second one can be solved by moving from Get...(path, ...)
to Get...(path, default)
.
Reloader
must be created in application, not in external library.Module
could be passed with contextThis approach let us drop log dependency.
Is it a problem? This is quite a common pattern. It is not a dependency, it is a possibility to redefine the logger.
And no possibility to change log format for example. Do you know any example of popular go module with error logging inside?
global state/global variables
A simplified testing must not lead to a complicated usage. Agree what a global state is not a good solution. But passing a reloader through all libraries may be annoying.
ModuleReloader
would be suitable as external service constructor argument. But to keep consistent Module
per request it would be better to pass through Module
object. Also you are able to create context with Module
.
However, could you provide any alternative solution?
Module should not be saved in global map.
I'd prefer not to mmap one file twice if reloader was created twice in two unlinked libraries.
External libraries must not create their own Reloader
s. I suppose either Reloader
should be used in external library object constructor or Module
passed with context.
Getters ( module.Int(..), module.String(..) ) methods must return errors if there was one while reading from mmapped file or parsing parameter string (for ints).
What do you expect a user must do if an error happened? They must get a value. Right now they get a default value if a parameter not exists or an error happened.
Ok. I'd love not to return errors on getters too.
go native maps are faster and safer than reading even from mmapped cdb file
Reading from a mmaped file is a memcpy.
Well yes, but actually no!
wiki
[mmap] implements demand paging because file contents are not read from disk directly and initially do not use physical RAM at all. The actual reads from disk are performed in a "lazy" manner, after a specific location is accessed
Why do you think it is unsafe? I was stuck on how mmapped file reading errors are returned. The answer is SIGSEGV. Nice! Now I agree there is no need returning error for getters at all.
I do not expect significant advantage of go map if any.
pushed benchmarks.
go test -run=^$ -bench . -benchmem -benchtime=100000x
goos: linux
goarch: amd64
pkg: github.com/onlineconf/onlineconf-go
BenchmarkMmappedCdb-2 100000 252 ns/op 48 B/op 5 allocs/op
BenchmarkGoMap-2 100000 12.0 ns/op 0 B/op 0 allocs/op
With no mmapping:
BenchmarkOrdinaryFileCdb-2 100000 2274 ns/op 48 B/op 5 allocs/op
You see map is up to 20 time faster due to there is no useless allocations with string <-> byte copying.
Path validation Invalid parameters paths sometimes can lead to hurtful mistakes. Path validation would be more reasonable in compile time.
Golang has no compile time hooks.
Sure! I meant runtime as soon as possible as a global variables.
Of course a validation is always desirable but not by the cost of the readability of the code.
With purposed approach readability doesn't suffer.
All the parameters defined as global variables in one place. Then defined params could be used with no fear of misprint. It would be better than a ton of params scattered around the code.
From an application point of view OnlineConf parameter path is just a string. There are no ways to get siblings or iterate through children. Moreover depending of a module configuration the path can be in the two forms: unix-like or dot-separated.
So what?
You really don't want to get random values from cache by different instances depending of their start time.
What do you mean?
Right now getters return a default value in case of errors.
Me too
And no possibility to change log format for example. Do you know any example of popular go module with error logging inside?
grpc-go?
ModuleReloader
would be suitable as external service constructor argument. But to keep consistentModule
per request it would be better to pass throughModule
object. Also you are able to create context withModule
.
You must have the only opened file per a module and must close the file once it was replaced. In the other case you will run out of file descriptors sooner or later.
External libraries must not create their own
Reloader
s. I suppose eitherReloader
should be used in external library object constructor orModule
passed with context.
In this case adding a configurable parameter to a deeply nested library may be painful and will require changing an API without a backward compatibility.
Reading from a mmaped file is a memcpy.
Well yes, but actually no!
wiki
[mmap] implements demand paging because file contents are not read from disk directly and initially do not use physical RAM at all. The actual reads from disk are performed in a "lazy" manner, after a specific location is accessed
Yes. This is why we use mmaped cdb in all our onlineconf client libraries. Page is read from the disk only the first time a configuration parameter located in it was read. After that this is just a memcpy. So it doesn't affect performance. Moreover even the first time with a very high chance page already will be in the memory because onlineconf-updater just wrote it.
Why do you think it is unsafe? I was stuck on how mmapped file reading errors are returned. The answer is SIGSEGV. Nice! Now I agree there is no need returning error for getters at all.
This is absolutely OK. You will never get SIGSEGV from a mmaped file on a healthy system. More than that if you break you drive while your application is in swap you will have the same result because underling logic is the same. Does anyone care about this?
I do not expect significant advantage of go map if any.
pushed benchmarks.
go test -run=^$ -bench . -benchmem -benchtime=100000x goos: linux goarch: amd64 pkg: github.com/onlineconf/onlineconf-go BenchmarkMmappedCdb-2 100000 252 ns/op 48 B/op 5 allocs/op BenchmarkGoMap-2 100000 12.0 ns/op 0 B/op 0 allocs/op
With no mmapping:
BenchmarkOrdinaryFileCdb-2 100000 2274 ns/op 48 B/op 5 allocs/op
You see map is up to 20 time faster due to there is no useless allocations with string <-> byte copying.
The benchmarks are too synthetics. They compare two relatively fast operations. There are even no locks. If you benchmark GetString
method implementations you will see that the difference is less but both cases are still very fast. After that you can remove ReaderAt from the underling cdb/mmap libraries and use direct access to the mmaped region. And finally you can use string
instead of []byte
in function input/output arguments. As you can see there are 5 allocs/op but only one is really required. In the end you will see what the timings are comparable.
But before all of this the main question is: does somebody really must read configuration parameters at such high rate or they just must fix their code?
You really don't want to get random values from cache by different instances depending of their start time.
What do you mean?
There are a lot of servers running a lot of instances. This instances are started at different time. They must have the same configuration. If you fallback to cached value in the case of integer parsing for example you can get two different values in two instances if one of them was started before somebody made a mistake (cached value from previous version) and another after (default value).
There are some disadvantages of current onlineconf-go interface:
Common
Must{Int,String,Reloader,Module...}
. But panics difenetly not appropriate insidenewModule
andinitWatcher
. Errors must be returned instead of panics.Reloader
Interaface of module reloader would be like:
For simple cases (short-running scripts) reloader watcher is not needed at all.
Module
Initialize()
is a crutch. Any (almost all) global variable complicates testing and decreases flexibility of library. I guess it would be better to provide interface like this:Module should not be saved in global map.
Getters
Getters (
module.Int(..)
,module.String(..)
) methods must return errors if there was one while reading from mmapped file or parsing parameter string (for ints).go native maps are faster and safer than reading even from mmapped cdb file. I guess it would be convenient to cache all parameters. todo reproduce benchmarks (i benched it too long ago and lost the snippet)
each getter method must have default value Methods like
*IfExists
are redundant. Requiring the second argument to be passed will simplify interface of onlineconf-go.If you care for check if parameter exists anyway you have to check error. In case you don't care of default value you must explicitly passs
0
.Path validation
Invalid parameters paths sometimes can lead to hurtful mistakes.
Path validation would be more reasonable in compile time.
Sometimes it is also useful to specify onleneconf parameters path prefix and further join prefix with sub-config parameters. Considering all this I formulated properties of valid onlineconf paths:
/
./
./
is a valid onlineconf path.//
is prohibitedProblem: nobody cares on handling errors in onlineconf getters and in PathValidations. I guess that the most frequent way to retrieve parameter value would be
Runtime panics are undesirable. So we can move parameters paths check to compile time with specifying default value right there.
further work
Not sure these features are reasonable right now or at all
The most safe way to work with onlineconf I imagine:
With this approach getter never return error because all the parameters are cached.