ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/
Other
15.83k stars 2.96k forks source link

more configurable datastore configs #3575

Closed whyrusleeping closed 6 years ago

whyrusleeping commented 7 years ago

Closes #3575

whyrusleeping commented 7 years ago

@Kubuxu could I get some early review on this?

Kubuxu commented 7 years ago

This branch allows for usage for new experimental datastore (sbs). To use it you have to create new fresh repo.

ipfs init --empty-repo
ipfs config --json Datastore.Spec '{"mounts":[{"child":{"path":"blocks","type":"sbs"},"mountpoint":"/blocks","prefix":"flatfs.datastore","type":"measure"},{"child":{"compression":"none","path":"datastore","type":"levelds"},"mountpoint":"/","prefix":"leveldb.datastore","type":"measure"}],"type":"mount"}'
whyrusleeping commented 7 years ago

@Kubuxu nice work so far :)

As an aside, we should look at adding json omitempty tags to the config fields we're deprecating, and start planning a migration path. I think we can pretty easily migrate people from the current format forward into the new format with an fs-repo-migration. At the same time we should adjust for 'new defaults' like the CORS thing that people who init'ed with old nodes might not have.

Perhaps with this migration, we even move the private key out of the config??

whyrusleeping commented 7 years ago

Idea:

Maybe try and ship this code in 0.4.8 (perhaps minus the sbs stuff for now) but keep the default config the same. Then in 0.4.9 we can have a migration to update everyone to the new format

whyrusleeping commented 7 years ago

sigh this keeps slipping. I really want to get it in soon...

kevina commented 7 years ago

@whyrusleeping is there anything I can do to help here?

whyrusleeping commented 7 years ago

@kevina Yeah, if you want to take this over. Drop the 'sbs' commit from @Kubuxu, and add some tests. That would be wonderful

kevina commented 7 years ago

@whyrusleeping @Kubuxu Should I should drop the "--wip--" commit also?

kevina commented 7 years ago

I dropped the last two commits, they are saved on feat/datastore-configs-wip for now.

kevina commented 7 years ago

A few notes before I am ready to finish this:

1) The Datastore specific config structs do not appear to be used. Should we make use of them or delete them. Right now the FlatDS stuct is in accuance as the code is still checking for "nosync" even the the FlatDS uses sync.

2) The config code doesn't seam to check for invalid keys (even outside the Datastore specific code) is this something we want to keep as is.

3) I will add "json omitempty tag" to the fields marked depicted as per @whyrusleeping suggestion.

whyrusleeping commented 7 years ago

@kevina yeah, i think we can remove the datastore specific structs. I wrote those before i figured out how this was gonna work. We can just use the interface map stuff.

Checking for 'disallowed keys' might be nice, though not super simple to do. If you want to, go ahead and add this checking, otherwise leave it unchecked.

your '3' SGTM

kevina commented 7 years ago

@whyrusleeping I agree with you on allowing either the old or the new config, but what exactly was the old config used for. What exactly would have happened if type was not leveldb, it seams like before the flatfs was hardcoded in.

whyrusleeping commented 7 years ago

@kevina the type field was never really used before.

kevina commented 7 years ago

Okay if the "Spec" field is non-empty I will use the new config, if it is empty I will use the old settings so that both will work for now. For completeness I will add back the old settings such as "type" and mark then as depicted and json omitempty.

kevina commented 7 years ago

@whyrusleeping as it is written now if I user creates a new repo using "ipfs init" wouldn't that technically be the new repo version as it will output an incompatable config? Should the version number reflect this?

Basically to do what you planned I think we need to support to repo version as given in ".ipfs/version". Does this sound right?

whyrusleeping commented 7 years ago

@kevina hrm... youre right. While it wouldnt need a migration forwards, it would need a migration back. Lets go ahead and bump the version number. Lets get this code ready, and worry about a migration once its ready to go

@Kubuxu @lgierth were there other things we wanted to change in the next migration? Something about default config values?

kevina commented 7 years ago

From a (private) IRC conversation started by @whyrusleeping:

whyrusleeping: Hey, on the datastore configs stuff kevina: yep whyrusleeping: Once you get that stuff working, it would be great to have a tool that is able to convert an ipfs repo into a given config whyrusleeping: so you can pass this tool the new config whyrusleeping: it opens up the repo, and transfers everything over whyrusleeping: might be a little tricky to do, but would be pretty nice kevina: this is for the migration tool right? whyrusleeping: yeah whyrusleeping: for migration, and then for if people want to try out different datastore configs whyrusleeping: say for example i want to use leveldb for everything kevina: you also need to be to able to do the reverse, that might be tricky if a user uses the new features whyrusleeping: yeah whyrusleeping: the tool can probably cover the 'easy' cases at first whyrusleeping: and then we can add in more logic later as demand for things changes kevina: in some cases it may be impossible' kevina: I would saw to a best effort and fail cleanly if the repo can't be downgraded whyrusleeping: Yeah, makes sense kevina: at least an upgraded repo by the tool should be downgraded whyrusleeping: yeah kevina: okay, i post these note on the issue for others to see, I will work on it tonight and tommorow whyrusleeping: great :) kevina: I am still unsure how exactly to handle testing though kevina: but one thing at a time :) whyrusleeping: yeap! one at a time whyrusleeping: Since we're going to do a migration anyways whyrusleeping: having the default config be the new style makes sense to me whyrusleeping: that way you don't really need to add any tests whyrusleeping: since all the sharness tests would be running with the new config kevina: that makes sense kevina: you still want to support both versions for now (old and new style config) kevina: ? whyrusleeping: i think we can probably drop the old style whyrusleeping: its pretty useless whyrusleeping: since it doesnt actually do much kevina: ?? but then users will have to run the migration tool for the next ipfs version, I thought we wanted to avoid that whyrusleeping thinks kevina: supporting both versions is not that big of a deal kevina: we can for now then remove support for the old version when it becomes a problem or after a fixed time whyrusleeping: ah, but the issue someone (i thought you) brought up was that if the user tweaks their config and it works for that version whyrusleeping: then downgrade to and older ipfs version whyrusleeping: with the same repo version whyrusleeping: it will break whyrusleeping: so as soon as the user tries out any new config things, the repo version needs to be changed kevina: a single version of ipfs can support two repo versions for now whyrusleeping: ah, youre saying we should bump the repo version with this change? kevina: something like that kevina: the only question is should "ipfs init" create the old or new style config and repo version whyrusleeping: right whyrusleeping: i would say the new one whyrusleeping: though thinking about it more, i think we should probably do a migration whyrusleeping: we have other things we wanted to fixup, defaults in the config file kevina: right you brought that up in the issue

kevina commented 7 years ago

@whyrusleeping @Kubuxu @lgierth was there any decision about how default values should be implemented anywhere? I am thinking they should be stored as metadata for the struct. Something like

type Foo struct {
  someField int `default:23`
}

Somewhat like how the json metadata tag is used.

whyrusleeping commented 7 years ago

@kevina thats an interesting idea, i kinda like it

kevina commented 7 years ago

Are there any config values we want to use this one (for example something to default to true)?

whyrusleeping commented 7 years ago

Well, theres the default config in repo/config/init.go, but thats really just default values for new repos.

whyrusleeping commented 7 years ago

I think that idea would be good for adding in new things, though its unclear if the implementation complexity would be high.

Would love to hear thoughts from @lgierth @Kubuxu @hsanjuan

kevina commented 7 years ago

All, @whyrusleeping and I had different things in mind. For this p.r. I will bump the repo version. My previous comments where trying to do something different from what @whyrusleeping had in mind. Sorry for the confusion.

whyrusleeping commented 7 years ago

@kevina no worries, clearly communicating requirements a skill i'm still getting better at :)

kevina commented 7 years ago

@whyrusleeping okay like I said I will bump the repo version and write a tool to migrate the config.

How should we handle the case when we can an old config, should we still support it, or error out? In order to error out we will need some way to detect if the old config is used since right now we ignore unknown keys.

whyrusleeping commented 7 years ago

I think we shouldnt handle the old config. The migration tool should rewrite the old syntax to the new syntax. If the old format is given with the new code, we should error out

kevina commented 6 years ago

@whyrusleeping ok I pushed an initial implementation which does what I think you want. The requested utility is at https://github.com/kevina/ipfs-config-5-6 for now.

kevina commented 6 years ago

Also please note that Datastore.Path in the old config was unused and was the absolute path of the datastore. This information is not stored in the new config so a round trip will lose that field. In my down conversion I just omit the field.

whyrusleeping commented 6 years ago

@kevina cool, changes looking good. We will want to add some more type checks, and probably some go tests that assert things get constructed successfully (have json inputs and check types on the results).

The next step after that is to take the tool you wrote, and work it into an ipfs-5-to-6 migration.

kevina commented 6 years ago

@whyrusleeping I addressed your comments, but have not added any go tests. Some guidance in that area would be helpful.

whyrusleeping commented 6 years ago

@kevina i'm thinking of having a few tests that have a json 'spec' field (what the user would have in their config) and then unmarshal that to a map[string]interface{}, and then calling constructDatastore on that. Then check that the object returned from constructDatastore is what we expect, (e.g. a measure datastore wrapping flatfs, or a mount datastore with multiple leveldb instances mounted under it at different points)

kevina commented 6 years ago

@whyrusleeping Okay I can do that, tested for nested data structures may be tricky but I will see what I can do

whyrusleeping commented 6 years ago

@kevina Yeah, it might be tricky for nested things, but i think with some of them you can inspect their internals a bit (like for the mount datastore).

kevina commented 6 years ago

@whyrusleeping the mount datastore does not expose the mounts so inspecting the internals will be problematic. I can use reflect to get some information but because the mounts are accesses via an unexported filed I can't get call Interface() to get to the mount type. I will have to keep digging into the internals of the measure and other datastores in order to fully inspect the nested datastores. Here is what I have so far: https://gist.github.com/kevina/50c1255543cbc8a674a30c7f6e674abf

Please advice.

whyrusleeping commented 6 years ago

@kevina hrm... I think for now, i'm fine just having checks that individual datastore types get constructed correctly. Moving forward (can be after this PR) we should add methods to datastores concrete types that allow introspection.

Go ahead and skip the in depth reflection checking for now, I didnt realize so many things werent exported.

kevina commented 6 years ago

@whyrusleeping I took a stab at the conversion code see https://github.com/ipfs/fs-repo-migrations/pull/62

whyrusleeping commented 6 years ago

I'm thinking that we should put some sort of file in the repo outside of the config that denotes the current repo format. That way, to change the datastore format, one could simply change the config, and restart ipfs. On startup, ipfs would check that the repo format matches the config, and if they differ, prompt the user to run a tool that will convert to the new format.

One approach here would be to take the json data of the datastore config, and canonically encode it to minified json, then base64 encode that. That way users won't mistakenly think they can change the datastore by just changing that file.

thoughts @kevina @Kubuxu @magik6k ?

magik6k commented 6 years ago

+1, I've ran into this multiple times when testing badger ds.

kevina commented 6 years ago

One approach here would be to take the json data of the datastore config, and canonically encode it to minified json, then base64 encode that. That way users won't mistakenly think they can change the datastore by just changing that file.

I am against any sort of intentional obfuscation like that. A simple comment at the top stating what the file is and not to modify it is better. Yeah I know JSON doesn't support comments, but we could easily strip all lines at the start of the file that start with say '#' before passing into the JSON reader.

lbeziaud commented 6 years ago

Yeah I know JSON doesn't support comments, but we could easily strip all lines at the start of the file that start with say '#' before passing into the JSON reader.

Or you could simply include the comment as JSON. Maybe not as readable but the file would remain standard.

{
    "_comment": "my comment",
    "data": "my data"
}
kevina commented 6 years ago

Or you could simply include the comment as JSON. Maybe not as readable but the file would remain standard.

Yeah, but that would not be as obvious.

whyrusleeping commented 6 years ago

hrm... okay. I can be okay with using json with a comment. I don't know if i like the idea of adding a comment to the json itself (as @lbeziaud suggests) since that makes it more difficult to compare, and also a bit less obvious to any potential tinkerers reading the file.

kevina commented 6 years ago

I'm thinking that we should put some sort of file in the repo outside of the config that denotes the current repo format. That way, to change the datastore format, one could simply change the config, and restart ipfs. On startup, ipfs would check that the repo format matches the config, and if they differ, prompt the user to run a tool that will convert to the new format.

I think it is a good idea.

We would still need a repo version upgrade. The upgrade will in addition to changing to the new style config also create the database definition file.

The datastore definition file should probably not be identical to the config, there are some parts that can be changed without an upgrade.

whyrusleeping commented 6 years ago

Yeah, definitely still need a migration.

I would argue that for simplicitys sake, the file should be exactly equal to the config, the tool that does the migrations should be able to detect cases where we can just no-op

On Thu, Jun 29, 2017, 00:19 Kevin Atkinson notifications@github.com wrote:

I'm thinking that we should put some sort of file in the repo outside of the config that denotes the current repo format. That way, to change the datastore format, one could simply change the config, and restart ipfs. On startup, ipfs would check that the repo format matches the config, and if they differ, prompt the user to run a tool that will convert to the new format.

I think it is a good idea.

We would still need a repo version upgrade. The upgrade will in addition to changing to the new style config also create the database definition file.

The datastore definition file should probably not be identical to the config, there are some parts that can be changed without an upgrade.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ipfs/go-ipfs/pull/3575#issuecomment-311858423, or mute the thread https://github.com/notifications/unsubscribe-auth/ABL4HINWaovMWWSMvDyKqbSSpdUFdNOiks5sIyWzgaJpZM4LdqsH .

kevina commented 6 years ago

I would argue that for simplicitys sake, the file should be exactly equal to the config, the tool that does the migrations should be able to detect cases where we can just no-op

My point is that there are some options that don't effect the datastore layout, you example the sync in flatfs. This option can be changed without requiring a migration.

whyrusleeping commented 6 years ago

@kevina i get that. And my point is that discerning what would require a migration and what wouldnt isnt trivial, so we should always run the tool. The tool itself can detect these scenarios that don't actually require any work to be done and exit out early

whyrusleeping commented 6 years ago

Then, if someone really wants to avoid the tool running, and they know what they are doing, they could make the same modifications in both files

kevina commented 6 years ago

@whyrusleeping that seams like an unnecessary annoyance and its probably not the way I would do it, but I don't want to argue the point.

whyrusleeping commented 6 years ago

@kevina there shouldnt be anything annoying about it. You make a change to the config and start ipfs, just like happens now. Users won't even notice the migration tool running (unless there are configuration change issues)

kevina commented 6 years ago

@whyrusleeping but you also said

On startup, ipfs would check that the repo format matches the config, and if they differ, prompt the user to run a tool that will convert to the new format.

That gave me the impression that the user will be prompted when ever there is a change in the database config, hence the annoyance part.