mattcrossmarc commented 5 years ago

I'm working on building an OrchardCore based web application that will be hosted in AWS on autoscaling containers where state is not preserved.

I understand the need to store data protection keys centrally and plan to use this provider to store them in AWS SSM:

https://aws.amazon.com/blogs/developer/aws-ssm-asp-net-core-data-protection-provider/

And, from the comment linked below, it sounds like it's possible to get to a stateless configuration for other types of data:

https://github.com/OrchardCMS/OrchardCore/issues/3381#issuecomment-477351856

Of the 3 requirements listed in the comment above, I understand how to do 1 & 2, but it's unclear to me how to cause OrchardCore to do 3.

1. Use a server database like SQL Server or PostgresQL instead of sqlite. I plan to use PostgreSQL in AWS RDS.

2. Use Azure blob storage for storing media. I plan to use AWS S3 for this by writing a custom media storage module like that done for Azure in OrchardCore.FileStorage.AzureBlob.

3. Using the database to store content types How is this enabled/configured? My apologies if this is in the documentation or source and I missed it.

Also, I see a couple of other things currently stored in App_Data and wwwroot that may need to be stored elsewhere. Can you please advise either way?

4. App_Data/tenants.json Can this be stored in the database or elsewhere?

5. wwwroot/is-cache I have seen references to this being a cache only that does not need to be shared across instances. Is that correct?

Thanks for your guidance.

Matt

deanmarcussen commented 5 years ago

Orchard Core will work well on AWS, I've had it running on EC2 instances for a POC.

However Distributed services for Orchard Core is still a work in progress by @jtkech . See https://github.com/OrchardCMS/OrchardCore/pull/2247 for some details on the wip, and what issues multiple instances have (namely caching redistribution, often for the admin)

S3 will work well as a abstracted media storage module, there is work towards making Azure Blob serve media from blob through Orchard Core, which will get resizing with ImageSharp working, and would also work with S3, refer https://github.com/OrchardCMS/OrchardCore/pull/4025
To use the database to store content type definitions disable the File Content Definition module. I use this on Azure with different slots. The trick if you already have content type definitions, is to do an export first, then disable, then import to the database.
I'm not sure about what is planned for tenants.json with distributed services. One thing that can be done with AWS is mounting a shared volume for App_Data, and pointing Orchard Core at it.
wwwroot/is-cache is a cache that does not need to be shared across instances. The same principles apply to the proposed Azure Blob Cache.

mattcrossmarc commented 5 years ago

Great, thank you @deanmarcussen. I'll work in those directions.

I hoped to avoid a shared volume for tenants.json and other data in App_Data since it would require separate backup and restore operations for recovery scenarios.

Does anyone know if there are plans to allow tenants.json to be stored outside App_Data? If not, I can definitely do work in this area, so any thoughts about how best to approach this would be appreciated.

sebastienros commented 5 years ago

My only concern is with tenants.json. If you had to create tenants dynamically that would be an issue. I think we should provide an implementation to support it before RC /cc @jtkech .

jtkech commented 5 years ago

The extension points are IShellConfigurationSources and IShellsSettingsSources. There are config sources using existing config providers but there are also mutable.

As i remember i did an experimental work to keep instances in sync with distributed signals / messages that was also holding some tenant transition states. That's ok to keep running instances in sync but anyway we will still need a distributed / shared store for these tenant configs / settings.

So, i think we will need custom implementations of the above services using custom config providers. I can plan to do some versions based on Redis in the distributed branch (as i did for data protection), then @deanmarcussen will be able to do some based on blob ;)

But there are many other places where tenants will need to be kept in sync, so for me the natural steps are to rework on the distributed services, maybe multiple PRs, the caching branch being one of them to be 1st merged (after reviewing the caching rules and how some race conditions are managed), then the custom implementations of the above services would be part of this incremental work.

But maybe better to wait if we can migrate to 3.0 soon because i have new conflicts every day ;)

sebastienros commented 5 years ago

I think we'll need an implementation that will make use of a distributed event broker (SQL Server or Redis). How ready is your branch? We haven't looked at it in a year I assume ;)

jtkech commented 5 years ago

Yes, by using a message bus and we opted to do a 1st concrete implementation based on Redis.

Last commit of this branch is on 7 dec 2018.

I remember the general concepts but i forgot all the details ;)

Update: i think OCC will be ready in 2025 ;)

sebastienros commented 5 years ago

I assume we need to store the properties in the database (same json structure/document with yessql) and then send a notification on other nodes to load the tenant. Let's get a quick look at it together if you want. And I am sure @deanmarcussen would love to see that too.

jtkech commented 5 years ago

Yes, this is okay for the shared store and for the notification this is already implemented in the distributed branch through a message bus (but based on redis).

jtkech commented 5 years ago

Let's get a quick look at it together if you want

Ah okay when you want but after all this time not sure i'm ready for the details.

sebastienros commented 5 years ago

We have to fight Alzheimer before it's too late

jtkech commented 5 years ago

😄

mattcrossmarc commented 5 years ago

Thanks for considering picking this up. And, thanks for the pointers to the existing interfaces @jtkech.

deanmarcussen commented 5 years ago

So Distributed before OCC @jtkech :)

Some thoughts

It could be interesting for ShellsSettingsSources to potentially become accessable via IShellConfiguration. This is newish from Azure https://docs.microsoft.com/en-us/azure/azure-app-configuration/overview and if tenants.json was accessed through IConfiguration then it would have good potential for Azure. Still the message broker to reset tenants of course...
Putting all the admin pages through the url /admin which I think is being looked at on https://github.com/OrchardCMS/OrchardCore/pull/4086 is useful for load balanced environments. It makes it easy on AWS to have the load balancer direct all admin requests to one server. Useful for security, or having the admin page live on a seperate url, or out of the load balanced environment completely. Also very useful during deployments to multiple environments to be able to remove admin from the equation (essentially pausing content updates while different servers code bases are out of sync). It's more dificult to do this with Azure, as the environments are more abstracted, but there are ways, and it's useful.
Blob yes maybe :), but IConfiguration isn't async and so far aspnetcore keep closing the issue to make it async. Redis all the way for Distributed!
Lastly re comments on a shared store in database. Which is a lot more abstracted from a specific Azure solution. Would we consider making ShellsSettingsSources for tenants through a database store, knowing that we need a connection string for the Default tenant at least, but once we have that, then the other tenant information could be readable from the Default tenant database. Noting that Async is still an issue with IShellConfiguration at least.

Definite support from me on this :)

mattcrossmarc commented 5 years ago

Thanks again for your help with this.

Azure App Configuration sounds like it serves a similar purpose to AWS Parameter Store. I frequently use Parameter Store to centralize storage and management of shared settings and secrets across stateless and serverless architectures. Last night, I successfully wired up Parameter Store for both data protection keys and app settings (via IConfiguration) with OrchardCore.

I mention all of that because it may be that providers for both Azure App Configuration and AWS Parameter Store could be implemented behind a common settings interface that enable storing tenants.json outside of App_Data without the need for Redis. Admittedly, I am very new to OrchardCore, so I'm not sure about all of the implications of relocating tenants.json. It sounds like one important factor is notification across the architecture when it changes.

I definitely agree about Redis being a reasonable fit. But, just in terms of striving for a more serverless approach and to avoid higher fixed costs associated with hosting Redis, maybe the interfaces for sync'ing could be abstract enough to allow a lower cost option. For example, simple webhooks could potentially be used together with low cost, high scale pub/sub services like Azure Service Bus and AWS SNS to notify interested parties that tenants.json (and other settings) have changed.

Apologies if I'm missing some important requirements here. Just wanted to share thoughts from an interested party. :)

sebastienros commented 5 years ago

@mattcrossmarc a common store for tenants.json can be anything, but when we mention redis here is to be able to get all nodes to be aware that the config has changed. Unless there is a way to push notifications when settings have changed?

mattcrossmarc commented 5 years ago

Thanks @sebastienros. That's good to hear.

Regarding pushing notifications, I would really like to be able to use an AWS SNS topic to which all web app instances subscribe. This could be implemented as an optional OrchardCore module that somehow gets a local, in-app event notification (maybe just a callback I receive from OrchardCore) when tenants.json changes, or more generally when configuration state changes that all instances need to know about.

Inside this custom module, I would also expose an HTTPS endpoint that AWS SNS would call anytime new messages are posted to the topic. That way, all app instances can know to reload tenants.json. Then, all I would need is some way to tell OrchardCore to reload tenants.json from the backend store (whatever that is in each case).

Let me know if I am making sense on this or you want more details.

mattcrossmarc commented 5 years ago

Just wanted to report back on progress (note the numbering change relative to my original list since I added Data Protection as a numbered item as I should have done before):

1. Data Protection in AWS Parameter Store I created a custom OrchardCore module for this, and it is working well. The key logic in this module is as follows:

        public override void ConfigureServices(IServiceCollection services)
        {
            var parameterNamePrefix = _configuration.GetValue<string>(
                "Modules:" + Settings.ModuleName + ":" + Settings.ParameterNamePrefixSetting,
                _defaultParameterNamePrefix);

            if (!string.IsNullOrWhiteSpace(parameterNamePrefix))
            {
                // See https://aws.amazon.com/blogs/developer/aws-ssm-asp-net-core-data-protection-provider/
                services.AddDataProtection().PersistKeysToAWSSystemsManager(parameterNamePrefix);
            }
            else
            {
                _logger.LogCritical($"No parameter name prefix was supplied for {Settings.ModuleName}. Ensure that an application setting containing a valid parameter name prefix is available at 'Modules:{Settings.ModuleName}:{Settings.ParameterNamePrefixSetting}'.");
            }
        }

Also, I set the module priority as was done in the Azure Data Protection module:

public override int Order => 10;

The only issue I had was that the IShellConfiguration instance I receive in the module did not provide access to the configuration information about the module (even without the Modules prefix on the setting name). Maybe I was doing something wrong here. To work around this, I just injected a standard .NET Core IConfiguration to get module configuration from appsettings.json. Let me know what I'm doing wrong if you see it.

When I create a site using a new theme with this module enabled, keys are correctly stored in AWS Parameter Store. Note that a key is created for the Default site when provisioning the site from scratch, but if that is deleted, it is not recreated when the site is restarted as would typically be the case.

2. Use a server database like SQL Server or PostgresQL instead of sqlite. Done during initial site configuration. Just works!

3. Use Azure blob storage for storing media. Not yet started. I plan to use AWS S3 for this by writing a custom media storage module like that done for Azure in OrchardCore.FileStorage.AzureBlob.

4. Using the database to store content types I did this in my new theme module by commenting out this module in my recipe file (thanks @jtkech):

//"OrchardCore.Contents.FileContentDefinition",

5. App_Data/tenants.json Currently in discussion above.

6. wwwroot/is-cache No work needed. Just a cache. Thanks @deanmarcussen.

jtkech commented 5 years ago

@deanmarcussen thanks in advance

interesting for ShellsSettingsSources to become accessable via IShellConfiguration ...

Yes, this was the case at some point but we separate it from other config files for perf reasons by providing the minimum set of settings to run a tenant when requested. We have to remember this if coming from another custom source (maybe still a local cache).

But for infos the settings reading part uses the same IConfiguration stack meaning that we can easily do an implementation using a configuration provider coming from an external library. Can't be provided by a module / feature but e.g by an app level helper (e.g .AddSettingsSources()).

Will see, maybe an idea around the default tenant that always exists, i tried something like that in the distributed branch. Or a way to specify through IConfiguration (not IShellConfiguration), a custom settings provider (and its options), as we add tag helpers through _ViewImports, easy to say ;)

Then, for the mutable part. Normally a config / setttings source is not mutable, okay but at some point we want to be able to change it with a kind of editor (e.g notepad for a json file). Can be integrated in OC, that's what we do when editing a given tenant or at the end of its setup.

Or can be done by an external and more complex centralized tool, in that case the IShellsSettingsSources would have to do nothing on Save(), each instance would have to subscribe to an event if provided by the library, or we would have to reload the tenant e.g through the admin that will propagate this event to other instances (tenant events based on Redis already experimented).

but IConfiguration isn't async

Maybe not a so big concern because config sources are intended to be load once on startup.

on a shared store in database. ..., we need a connection string for the Default tenant at least

Yes, maybe we could use the above idea around the default tenant, and in that case the requirement would be that the database settings would need to be pre-configured (e.g appsettings.json). Hmm, IShellSettingsManager can't inject ShellHost (that uses it) and then can't create a shell context.

@mattcrossmarc no problem, happy that i could help you.

the IShellConfiguration instance I receive in the module did not provide access to the configuration information about the module (even without the Modules prefix on the setting name).

Yes, you don't need the Modules section, then, without having to specify it, you need at least in your json a main section named OrchardCore, then an optional sub section whose name is a given tenant name, then your module child section. If your module section is directly under the main section, config values are applied to all tenants, otherwise only for the related tenant.

Many base services and default implementations are already defined, maybe we need more extension points at the app level to configure custom settings sources. Then we will be able to think about any other possible custom implementations at the tenant level (provided by modules) or at the app level.

Need to be done step by step and based on some minimum fondations. That's what i began in #4001 by defining some caching rules, e.g update a local cache only when re-reading data from the store (shared or not), not when storing. I updated most of the tenant level caches but i think we will need to do the same kind of things for our mutable config / settings sources.

@sebastienros, before reworking on any distributed / stateless things, i would prefer to wait for #4001 and other adaptations just after as said above. So that when reworking on the distributed branch we will add new things most of the time without having to update many existing implementations.

Nothing urgent for me, so we can wait the 3.0 migration if it is better. But if you want such things before 3.0, because i feel that here we need some chronologic steps, maybe as a 1st step we could merge #4001 (after reviewing), maybe not so much more conflicts that i already have every day in the 3.0 banch.

mattcrossmarc commented 5 years ago

Another update on my side:

1. Data Protection in AWS Parameter Store Thanks @jtkech. After updating my appsettings to use a parent OrchardCore key, I was able to read in the module configurations using IShellConfiguration. Here's an example:

{
  "OrchardCore": {
    "Caper.DataProtection.AWS.SSM": {
      "ParameterNamePrefix": "caper/dev1"
    }
  }
}

3. Use AWS S3 for storing media I've successfully implemented an S3FileStore and an IMediaFileStore on top as a module. Functionally, it's working well for all CRUD operations, including provisioning of initial site media from a theme.

One issue I'm still trying to sort out is that most file operations get called twice by OrchardCore. For example, when using the Assets page in Admin, listing the contents of a directory will result in two calls with the same parameters to the file store. Are you seeing the same thing? Maybe I'm wiring up the IMediaFileStore incorrectly, but I'm using the same approach as is used in the AzureBlob media store (i.e., services.Replace).

By the way, if you want me to PR these modules back in to OrchardCore, I'm glad to do that. No worries either way and glad to contribute back if you think others might be interested in using the modules.

jtkech commented 5 years ago

@deanmarcussen is best placed to give you more infos, it has a pending PR where azure media has been refactored (e.g where a blob image can be resized by ISharp) and, i think, that will be merged soon.

Also, we will migrate to netcore 3.0 soon.

mattcrossmarc commented 5 years ago

Thanks for letting me know about both upcoming changes.

deanmarcussen commented 5 years ago

@mattcrossmarc Sounds like you're getting there with S3.

For info the PR changing how azure media works is #4025, which hopefully will go in soon, after core3, so you can take a look there to see the changes to the IMediaFileStore. Not too many to the store itself, but there is an addition of a method which reduces some of the head requests that can be made when reading a file. There are a lot of other changes where we have the local file system caching files, and serving directly to enable resizing with ImageSharp, so perhaps have a read of the PR comments, to see how they will impact you.

Regarding file operations and multiple calls.

The media app generally makes a call to get a directory listing and another call to get a file listing of the current directory. So maybe that is what you mean by multiple calls? I've not seen it make multiple calls for no need, but generally Azure Blob will make a few head requests, because of the inherit design of the underlying library, (basically verifying file existence before making operations), and while I'm less familiar with the S3 library I do use it from time to time, and I think the design is similar.

So for example when you check if a file exists, and then (moments later) try to download it, you will have two head requests, and then a stream will open. That is just how the underlying libraries work.

If you want to, put your S3 file store somewhere on github, and I can make time to have a look and help you out if you need it.

carlwoodhouse commented 5 years ago

Just wanted to +1 the idea of stateless tenant.json :)

MatthijsKrempel commented 4 years ago

@jtkech Can I offer any help?

jtkech commented 4 years ago

@MatthijsKrempel okay, i will let you know, thanks

sebastienros commented 4 years ago

I think we are just missing a way to store tenants in the db (or in blob storage). @MatthijsKrempel suggest in the default tenant database.

jtkech commented 4 years ago

I'm a little blocked on the distributed branch, there are new options that i have to discuss with @sebastienros which is too busy, so meanwhile i will re-think about stateless tenants.

So, we have 2 kind of mutable config / settings sources that are updated on edit / save / setup, tenants.json for the min set of shell settings, and all appsettings.json under tenant folders.
Do you know that all can be pre-configured? So, if you don't need to create tenants dynamically, stateless tenants is already there, at least partially. I tried to put in the root appsettings.json under the OrchardCore section a global ConnectionString, DatabaseProvider and for each tenant a State = Running, and if needed a RequestUrlPrefix and a TablePrefix. Then i could remove tenants.json and all tenant specific appsettings.json, and it was working.
Hmm, to dyn create tenants, one strategy would be to check from the main configuration if the default tenant is fully pre-configured. If so (and maybe with another config option) we could load / save settings / configs of any other tenant from the db, this by using the default tenant.

deanmarcussen commented 4 years ago

@jtkech funny, I am a little blocked as well, so I made a start on something here yesterday at lunchtime.

Because the ShellsSettingsSources and ShellsConfigurationSources are loaded in the DI at Application Level, and need to be mutable for both the tenants.json and appsettings.json it made more sense to me to follow you original suggestion and create a blob store for them.

Storing in the database in the default tenant made less sense, because the default tenant may or may not have been setup, and how would you load its configuration, if its shell has not been loaded (not so hard via just IConfiguration instead of IShellConfiguration), but also its ISession store, which is registered at Shell level, rather than application level. Or then maybe you have to hack another ISession in at application level (maybe resolvable from the default tenant scope, but only if it's shell had been built, and then it felt like it was getting circular. i.e. we need the default shell, but the default shell needs configuration etc?)

So what I did so far, to see if it would work, was make a BlobShellsFileStore, and replace the ShellsSettingsSources with a BlobShellsSettingsSources

So now I have tenants.json running of Blob Storage, and next would be to also implement the same for each tenants appsettings.json (and the overall application appsettings.json)

This way those files are also available to edit from the private blob container, much as if they were in App_Data

I did also have a look to see if Microsoft.Extensions.Configuration.AzureAppConfiguration or KeyVault could be useful in this scenario.

They are useful for centralized configuration, but they have no mutation, which we need for tenants.json and appsettings.json so while anyone could use them with no changes to OrchardCore (they are just a dropin at Program.cs level) we can't make use of them for mutating as we do with live tenant creation.

jtkech commented 4 years ago

@deanmarcussen yes, you understood everything ;)

Your blob solution looks good. I think you also meant we would be able to edit / save from the admin / after a setup. Maybe we would not need to split the settings of all shells from their tenant specific config values, maybe still good for perf, will see. I think we don't need to care about the the overall application appsettings.json and any other regular config sources, we would have just an additional and mutable one where we can put what we want.
About the database solution it was a first draft in my mind, finally i think we don't need that the default tenant is fully pre-configured. From the main config we only need e.g a kind of options to validate it and to provide what we need to connect to the DB, as we would need for the blob.

Then we could use a kind of (or extend the existing) MinimumShellDescriptor and a scope on it to access the database, as we already do to check if there is a currentDescriptor when we try to build a shell, see in ShellContextFactory. So we could check if a currentConfiguration exists and then use it OR e.g run a setup that will save the settings in the database and so on.
Finally, by implementing the same interfaces, i think that both solutions are quite similar, only the details of loading / storing will be different. The advantages i see with the database solution is to have a more general solution, easier to configure, at least for me ;) and maybe because of the transaction capability.

Would be good to work on it together and maybe also on my others shells and distributed branches where there will be some other related things on which we will need to be in sync. Need to leave, i will think about it this night, if i don't fall asleep in front of my PC, as i did these last nights ;)

deanmarcussen commented 4 years ago

Yes definitely @jtkech let's get together on skype and figure it out.

I will try and push something tonight, just for starters, so you can see where it is at :)

Your blob solution looks good. I think you also meant we would be able to edit / save from the admin / after a setup.

I was thinking less about editing from the admin, and more just editing directly on blob storage, as a json file, as you would do with App_Data already. However maybe we could also expose something in the admin as well.

Maybe we would not need to split the settings of all shells from their tenant specific config values, maybe still good for perf, will see. I think we don't need to care about the the overall application appsettings.json and any other regular config sources.

I figured because it is about load balancing that probably we should do all the sources, because otherwise someone will make a config on one server in the App_Data and wonder why it isn't reflected on the other server.

For now (because it was easy to cut and paste) I have left it with the files split across tenants. Maybe not the best for Startup performance with multiple tenants, so we will see.

About the database solution it was a first draft in my mind, finally i think we don't need that the default tenant is fully pre-configured. From the main config we only need e.g a kind of options to validate it and to provide what we need to connect to the DB, as we would need for the blob. Then we could use a kind of (or extend the existing) MinimumShellDescriptor and a scope on it to access the database, as we already do to check if there is a currentDescriptor when we try to build a shell, see in ShellContextFactory. So we could check if a currentConfiguration exists and then use it OR e.g run a setup that will save the settings in the database and so on.

Finally, by implementing the same interfaces, i think that both solutions are quite similar, only the details of loading / storing will be different. The advantages i see with the database solution is to have a more general solution, easier to configure, at least for me ;) and maybe because of the transaction capability.

Yes, it could go either way here. I tinkered with blob because it was easy for me ;)

Would be good to work on it together and maybe also on my others shells and distributed branches where there will be some other related things on which we will need to be in sync. Need to leave, i will think about it this night, if i don't fall asleep in front of my PC, as i did these last nights ;)

Yes, so many things, I message you tonight if I am awake too ;)

jasonchester commented 4 years ago

1893 seems like a related ticket that can also be closed

dalenewman commented 3 years ago

@mattcrossmarc have you open sourced any of your modules related to AWS services?

mattcrossmarc commented 3 years ago

@dalenewman I haven't, but I'm glad to share them. Let me get the relevant code moved into a repo that can be made public.

mattcrossmarc commented 3 years ago

@dalenewman My apologies for the delay. Here is a repo with the AWS S3 media module, updated for OrchardCore 1.0.0:

https://github.com/crossmarc/Crossmarc.Media.AWS.S3

OrchardCMS / OrchardCore

Guidance on stateless configuration #4107

1893 seems like a related ticket that can also be closed