marklogic-community / roxy

Deployment tool for MarkLogic applications. Also provides optional unit test and XQuery MVC structure

Bootstrap wipes undeployed indexes #691

Open RobertSzkutak opened 8 years ago

RobertSzkutak commented 8 years ago

A customer shared with me yesterday that one of their pet peeves is that bootstrapping wipes any indexes not defined in ml-config. Frankly, this is one of my pet peeves too. I would like to modify Roxy to not do this by default. This may also be a good reason to revisit our discussion of deprecating assets here: #681

dmcassel commented 8 years ago

It does that as the only way to remove indexes at all. The way to think of bootstrap is "apply the configuration in my files to my MarkLogic instance." Users are strongly encouraged not to manually add indexes for this reason.

Perhaps this could be addressed by improving and documenting the capture functionality. In my opinion, changing bootstrap to not remove indexes would be problematic.

On Nov 16, 2016 5:17 AM, "Robert Szkutak" notifications@github.com wrote:

A customer shared with me yesterday that one of their pet peeves is that bootstrapping wipes any indexes not defined in ml-config. Frankly, this is one of my pet peeves too. I would like to modify Roxy to not do this by default. This may also tie into our discussion of deprecating assets here: #681 https://github.com/marklogic/roxy/issues/681

RobertSzkutak commented 8 years ago

I'm seeing a growing number of use cases where people are developing multiple apps with Roxy against the same content database. These users want to be able to bootstrap/wipe only the indexes used by their application and not worry about breaking anyone else's application.

I'm also seeing a growing number of instances where someone adds an index manually, forgets to add it to ml-config.xml, and then accidentally wipes it in a bootstrap.

Several users have also commented to me that they see the index wiping behavior as 'unexpected' because, for example, Roxy doesn't wipe databases or roles not defined in ml-config.xml.

Users prefer to use Roxy to quickly add lots of indexes to all of their environments but they fear the unintended consequences of doing so.

While you can chalk such scenarios up to people/governance problems, most people (especially people on large projects running on the same database as other large projects) lack the foresight to handle these scenarios correctly and instead run headfirst into nasty problems.

The way I see it, there are several options here for us to help provide users a better experience:

1) Provide a command-line argument to turn index wiping on or off when bootstrapping
2) Modify the default behavior to not wipe indexes when bootstrapping
3) Provide better documentation and warnings about the aforementioned scenarios. Perhaps also add a command to show what would be wiped/added if a bootstrap were run.

I'm personally in favor of all of the above. I feel like that would do the most to limit the possibility for disastrous user error. However, I acknowledge that there will always be some scenarios that only better development practices and governance can solve.
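A minimal sketch of what the preview in option 3, combined with the on/off switch in option 1, might compute. This is not Roxy code: `IndexSpec`, `preview_bootstrap`, and `wipe_undeclared` are hypothetical names, and the index lists are supplied by hand here rather than read from ml-config.xml or a live MarkLogic instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class IndexSpec:
    """Hypothetical, simplified identity of a range index."""
    scalar_type: str
    namespace_uri: str
    localname: str


def preview_bootstrap(configured, deployed, wipe_undeclared=True):
    """Report what a bootstrap would add, remove, and keep.

    configured: indexes declared in the project's config file
    deployed:   indexes currently present on the database
    wipe_undeclared: if False, indexes missing from the config are left alone
    """
    configured, deployed = set(configured), set(deployed)
    to_add = configured - deployed
    to_remove = (deployed - configured) if wipe_undeclared else set()
    to_keep = deployed - to_remove
    return {
        "add": sorted(to_add),
        "remove": sorted(to_remove),
        "keep": sorted(to_keep),
    }


# Illustrative data only: one shared index, one new index in the config,
# and one index that someone added by hand on the server.
configured = [
    IndexSpec("string", "", "title"),
    IndexSpec("dateTime", "", "created"),
]
deployed = [
    IndexSpec("string", "", "title"),
    IndexSpec("int", "", "manual-count"),
]

for action, indexes in preview_bootstrap(configured, deployed).items():
    for index in indexes:
        print(action, index.scalar_type, index.localname)
```

With `wipe_undeclared=False`, the manually added index moves from the "remove" list to the "keep" list, which is the difference between the current behavior and option 2.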

dmcassel commented 8 years ago

I can see the challenges with managing multiple apps with one content database. I'd be on board with your suggestions #1 and #3, but I'm not convinced on #2.

For the multiple app case, I'm guessing they already have multiple Roxy Deployer projects, one for each app. The problem is that the various projects are wrestling over control of the database. What if, instead, you set up a separate Roxy Deployer project that had sole responsibility for the content database's configuration? The downside is that someone who wants to set up just one of the applications will also need to get the database project, but it does consolidate control of that database into one place. That's probably better for governance and stability. Each of the other projects could remove the content database from ml-config.xml, which should prevent bootstrap from touching that database (might require some tweaks to setup.xqy to avoid errors).

The idea of multiple projects making changes to the content database config sounds risky. Likewise, if all projects drop responsibility for removing no-longer-used indexes, removing them would require manual intervention (again, error prone).

RobertSzkutak commented 8 years ago

I'm fine with your decision to implement 1) and 3). Presumably we could document a way to override app_specific and force 2) by default.

I'm absolutely pushing our customers toward doing exactly what you said: using one Roxy project for bootstrapping and others for app deployments. It works pretty well. However, imagine an ml-config.xml with 10+ app servers and ~100 indexes. Now imagine you're working on a project that needs only 2 app servers and 6 of those indexes. There's definitely value in maintaining this info in an ml-config.xml for each project, so that you can track which projects actually use which resources and optimize bootstrapping for your local environment. Now, there's nothing wrong with also maintaining a condensed ml-config.xml, until you are assigned to a second project and need to merge two custom, condensed ml-configs.

I agree that there will always be problems that can only be solved by better management/governance. For example, one project erroneously removing an index that another project needs. I do however feel strongly that adding in these features will inevitably prevent a disastrous scenario for someone.

dmcassel commented 8 years ago

app_specific is always available for such cases, as you noted. +1.

My suggestion for the database-controlling project is that it only controls the content database -- leave out the app servers, and keep them in the ml-configs of the various application projects. Yes, that means an app that needs 6 indexes will have a database with 100, but that's how it will be running in production. As for keeping track of which apps use which indexes: yeah, that's going to be tough. It can be done with comments, but only if whoever controls the project is good about updating them.

I don't know how common this is, but perhaps (in addition to the command-line argument), we can add a wiki describing best practices.

RobertSzkutak commented 8 years ago

Ah, okay, that makes a lot more sense. I completely agree about trying to get a wiki and/or a blog post describing best practices that's regularly revisited. Talking with a customer yesterday, they were surprised how many of their use cases we've already built that they didn't know about.

Maybe we can take this offline and prioritize and define some goals for my MBO this quarter to make sure time for this is allocated and completed.

heelix commented 7 years ago

And just as a side note (just noticed this thread) - the purging is not bad default behavior. One of the customers Rob is talking about is using this to purge out 'developer added' indexes. Part of this is routing the change requests to someone who is managing that master config file.