ctdk / goiardi

A Chef server written in Go, able to run entirely in memory, with optional persistence with saving the in-memory data to disk or using MySQL or Postgres as the data storage backend. Docs: http://goiardi.readthedocs.io/en/latest/index.html
http://goiardi.gl
Apache License 2.0
280 stars 38 forks source link

V2 Proposal #80

Open alekc opened 4 years ago

alekc commented 4 years ago

While working on several issues on goiardi project it became painfully obvious that due to the age of it and its legacy its not easy to deal with changes and (especially) testing because of the constant need to try to adapt "modern" approach and not break existing features/legacy.

I'd like to propose a v2 version of goiardi which hopefully permit us to be able to deal with several issues in much more optimal manner. I think most of the project related to chef (requests, validations, cookbook searching, etc) can be reused so we should be able to get away with only minor blood.

Main points:

Thats it for now, let me know your thoughts.

alekc commented 4 years ago

Added additional point about vendoring.

ctdk commented 4 years ago

I apologize for this response taking so long - I worked on it for a long time yesterday, until my left arm started both going numb and hurting a great deal. :-( There are many good points here. Here are my thoughts, along with some extra information:

To address some specific points:

Testing and goiardi have an interesting history. I've tried to add go tests where it's practical, and there's more in the 1.0.0 branches, but those haven't always worked as well as I would have hoped. Since it's available, I also rely heavily on the various versions of chef-pedant (for 11.x, the old standalone 12.x oc-chef-pedant, and now I'm dipping my toes into the current oc-chef-pedant in the chef-server repo).

The pedant suite does provide a useful way to test every aspect of the server, in ways that don't work as well while testing internally, and has exposed countless problems. Unfortunately, it does have some downsides. It's also pedantic about error messages, so there's been a lot of work making the error messages fit what it expects them to be. There are a lot of tests in place to make sure it will still work with ancient cookbooks, but this may be better with the more recent test suite.

Some tests also have needed to be changed to fit what I want goiardi's behavior to be. This is especially the case with authentication, which is much further ranging than you'd expect. Basically, goiardi does its authentication checks first, with allowances to be made for endpoints that don't require authentication, and then moves on to the request handlers. The official chef server does it the other way around, so pedant will flip because it expects a 404, 405, etc. instead of a 401 or 403 error. I think it's better to do it the way I did, so I went and changed the tests. There are also a handful of tests that just plain don't work for utterly trivial reasons, but at the same time are kind of hard to find where exactly they're coming from. The tests in question are for abusing the metadata to do things like make the description an array or setting a metadata entry to nil. Chef sweeps these kinds of issues under the carpet because formerly all that information was stored as JSON documents in CouchDB, and now they're gzipped JSON blobs in Postgres. Since goiardi uses more proper database tables, which I stand by, translating back and forth is a bit more complicated.

All that said, there's a considerable amount of overhead with the various pedants too. It has a lot of dependencies and can be pretty finicky. There are also many tests that are less relevant to non-Chef Software Inc. users, tests for internal endpoints that goiardi doesn't use, and tests for old behavior and weird JSON abuse you shouldn't actually be able to do. Wading through all the rspec is itself an ordeal, so I'm open to alternate suggestions for at least normal testing.

One thing I've been thinking about lately is how 1.0.0 is basically ready and has been for a while. The holdup's been updating the documentation and testing out Policyfiles. I also usually did most of my goiardi work commuting to and from work, which unsurprisingly is no longer the case. The ACL work was also extremely draining and is one of two big reasons 1.0.0 development stalled (the other being that I didn't have an actual use case for organizations for a long time). Getting the ACLs to work in the way Chef Server's do was both arduous and full of false starts. I even ended up throwing out a few years of work and starting over last year. Ironically, actually mucking about with the ACLs seems to be discouraged now.

Basically, I'm looking at finally letting go and not trying to catch up to the absolute latest version of Chef before putting out 1.0.0 is the way to go, and instead of devoting energy to battling with the latest pedant and its changes to get the current code ready, merging in those changes you've made lately (and thank you for those, btw), and then doing this stuff. What do you think?

ctdk commented 4 years ago

I may be overlooking it, but where's the vendoring point?

alekc commented 4 years ago

I may be overlooking it, but where's the vendoring point?

No idea where it went :| My point was: lets abandon vendoring and go completely with go.mod, as far as I can see, we don't use any private dependencies, so we can cover all existing vendoring with go.mod

I need to check out echo to compare, but if it has significant advantages hopefully the switch woldn't be too difficult since the general concept is in place

Yes, it should not be too much of an issue passing from one to another. Tbh, gorilla mux is already much better compared to what was there previously, but I still prefer the syntax/functionality of echo a bit more 😉 Also, return headers (such as json, etc) are set automatically based on return type, and its a bit easier to catch the output before its being emitted (i.e. for logs).

Additionally many of the handlers have been redone from how they look in master.

Yes, I can see them in 1.0.0 branch, looks much better indeed. I suppose at the end we would have all handlers to be structured in a specific methods like you did for clients?

    s.HandleFunc("/clients", clientListHandler).Methods("GET")
    s.HandleFunc("/clients", clientCreateHandler).Methods("POST")

Goiardi does export some metrics to statsd

yes, we amended statsd reporting (to include some additional bits) in one of our commits (you can find it in pr relative to the memory leak), opencensus would permit us/consumers to tackle following points:

In regards to data storage: MySQL's already been dropped in 1.0.0-dev, because Postgres both has more useful features and because there's only so much of me to go around.

Make sense, I would say once we have a properly defined interface for data layer, if someone need mysql it should be fairly trivial bring it back without any big changes to the main code structure.

The local datastore is definitely not the best solution now and hasn't really been once it moved past being a learning project, if that makes sense. I went with implementing a key/value store inside because it was easier initially, and then was leery of breaking backwards compatibility. I've definitely thought about sqlite for goiardi. One thing that did present a bit of a stumbling block when I first thought about using sqlite was trouble cross compiling goiardi because of the cgo requirement. That being said, it can be worked around (or just not worry about providing the more esoteric binaries like for NetBSD or linux/s390x).

Several points:

I'm a little more loathe to change the db migrations, since the sqitch managed setup it's currently using does work pretty darn well. It does require an external tool, though, so I'm open to suggestions.

For our scenario (running in kubernetes) db init/upgrade was a bit of a sore point (at the end we ended up using a pgsql init script and sending off the dump file). It didn't help that sqitch is perl based, so it requires some specific perl environment and prior knowledge in using perl. But its not a deal breaker for now, and can be tackled later on.

Validations, json, chef inspec, etc.

While rewriting the cookbook handling part, I noticed that once I have converted anon json to a proper json struct, I could delete almost all of validation logic because its was taken care of by the json unmarshalling function (i.e. you cannot place a string inside a num, or if there is a nil value for an array, json unmarshal would declare an empty array bit), so I would be very tempted to define our structure depending on valid requests and in case of invalid json arriving in request, just refuse it with a StatusBadRequest and a reason inside the body/logs.

This brings me to another point which I have expressed previously in one of our MR: Chef-pedant, should we rely heavily on it like we are doing now? The pedant suite is a tool to test chef server, but for something like goiardi, I think we should take as a first priority the functionality from the client point of view (as in using cinc-client, knife, chef-client). Like it happened with knife user upload for example (https://github.com/ctdk/goiardi/issues/75) where existing logic (presumably taken from chef-pedant?) was breaking the actual end user experience and preventing them from making a valid request. I dont think there is a lot of value for goiardi to be returning the error message in exact same way as the official chef server does (if it's not used for validation logic inside chef-client/knife)

I think the best course of action for testing (besides the unit testing for specific functions), would be to have a set of valid requests (i.e. cookbook uploads, user creation, deletion, etc) and run them through goiardi handlers. I would imagine them working in following fashion:

Basically, I'm looking at finally letting go and not trying to catch up to the absolute latest version of Chef before putting out 1.0.0 is the way to go, and instead of devoting energy to battling with the latest pedant and its changes to get the current code ready, merging in those changes you've made lately (and thank you for those, btw), and then doing this stuff. What do you think?

Probably trying to catch up to the latest version would not be necessary tbh. Whats the current situation on the acls and orgs, are they finished? If you are not against the restructure of the whole goiardi code like I outlined above, it might be better to wait a bit and deploy the refactored (/rewritten) version, after all we cannot deploy 2 major version with potentially breaking changes twice in a short period of time.

alekc commented 4 years ago

Oh, and by the Data interface I meant something like this:

type DataLayerStruct interface {
    BeginTransaction() error
    Commit() error
    Rollback() error
    GetCookbook(ctx context.Context) cookbook.Cookbook
    UpdateCookbook(ctx context.Context,cookbook2 cookbook.Cookbook) error
}

where BeginTransaction/Commit/Rollback in case of in memory for example would probably be some dummy functions

Also added point about https://github.com/golangci/golangci-lint

ctdk commented 4 years ago

It sounds like we're in general agreement, then. First, I'd like to address a few of your points:

All that said, not relying on chef-pedant for normal testing would be pretty nice.

Additionally:

And finally, I suppose there's not really a reason to cut a release of the current 1.0.0 work right now. I was thinking about making one as a checkpoint, as it were, but taking the time to make it more better is better in the end.

Thank you for all your work and ideas that you've had so far, by the way. I apologize for being kind of slow on this lately.

alekc commented 4 years ago

No worries :)

For the next couple of weeks or so I will be busy with deploying the current version of goiardi to prod, once that's over and we are stable I will be able to slowly work on the new implementation. I will post updates once I have something usable.

Leaving this open for further discussion.