OrchardCMS / OrchardCore

Orchard Core is an open-source modular and multi-tenant application framework built with ASP.NET Core, and a content management system (CMS) built on top of that framework.
https://orchardcore.net
BSD 3-Clause "New" or "Revised" License
7.37k stars 2.38k forks source link

Site/Tenant startup fails when project has no wwwroot folder #3459

Closed deanmarcussen closed 5 years ago

deanmarcussen commented 5 years ago

Relatively minor issue, but if you follow the guide to create a CMS Web Application with Nuget (Beta3), with an empty web project, and create a Saas site, then create a tenant, the tenant startup fails if you don't have a wwwroot in the mvc project. oc-tenant

The error is from ImageSharp, and if you follow the instructions and create a wwwroot folder everything works fine.

Actually just checked, it happens when you create a blog recipe on main tenant as well (rather than a Sass recpipe)

Should we be providing ImageSharp the App_Data folder for it's cache so that cache's are separated tenant by tenant?

Or update the guide to say create a wwwroot folder?

Skrypt commented 5 years ago

This folder should be automatically created simply if it's not found. We could add something that would add this folder when a project is referencing OrchardCore.Application.Cms.Targets or also when the Media module is enabled. Because the issue is that ImageSharp requires having a /wwwroot/is-cache folder. It will create the is-cache folder only if the wwwroot folder exists else it will throw an error.

deanmarcussen commented 5 years ago

@Skrypt Yes, ImageSharp takes the env.WebRootFileProvider for it's PhysicalFileSystemCache. We could swap that out for something that uses the OC File Providers. It could handle restricting the cache size that you were after as well

Skrypt commented 5 years ago

Makes sense to me to have a different cache folder for each tenant.

With Azure Storage that could still be an issue since we don't want to have it in the Media folder at all. If you use Azure Storage then your Medias are the one listed from your Azure Storage Container which shouldn't include the is-cache folder since ImageSharp won't generate those files on the Azure Storage.

If we want to have the is-cache folder in the Media folder then we can set the Medias to be coming from Azure Storage and then add the is-cache folder as a local "special folder" OR add the is-cache folder in /App_Data/Site/sitename/is-cache and we add it to the media list as an option. But this will always be a local folder with local assets.

Also when I say Azure Storage I mean also any other CDN's

deanmarcussen commented 5 years ago

ImageSharp doesn't have a point specifically for cache location. But it does have an extensibility point for swapping the cache implementation - which is trivial.

I just had a look through my is-cache folder. Not exactly user friendly! So maybe not the best to expose it in the Media Folder - although it would provide an entry point for someone wanting to clear the cache.

/App_Data/Site/sitename/is-cache as a local folder independent from Blob Storage sounds good.

Skrypt commented 5 years ago

If I'm looking at this issue there : https://github.com/SixLabors/ImageSharp/issues/216 Swapping the image cache implementation will be available only when this PR will be merged. https://github.com/SixLabors/ImageSharp.Web/issues/5

deanmarcussen commented 5 years ago

@Skrypt I had a play with swapping PhysicalFileSystemCache out.

It swaps out with no problems.

Have you had a look at your is-cache folder size and contents?

I didn't realise this, but discovered that even if the media is not resized, if it passes through the IS pipeline, IS moves it into is-cache folder.

Did get it Image resizing working with a cdn though on #3468

Will wait for #3200 to come in and further feedback on a cache per tenant, before going too much further.

Skrypt commented 5 years ago

I need to try this locally to see if all my images are appended to my is-cache folder. So far, what I've seen is that I had only my resized images. Can you look if there's not some image files that you are trying to resize into a non allowed size ; that would maybe explain why it appends the full sized image in your is-cache folder since it can't resize it to the size you set it?

I'm downloading the is-cache folder from the Azure server by FTP it might take a while before I can tell you the size/content of it.

I noticed that the image cache sometimes tries to rebuild itself. And while doing so it needs to redo the resizing process on the images which can slow down the website experience. I tried once to go and click manually on all pages of my website to pregenerate all resized images in the cache but I need to make more tests to understand how that cache is working. Personnally I'd prefer that all the resized image be uploaded on my CDN and that it bypass that is-cache folder completely. I would keep it with a really low weight treshold just in case that the CDN upload is in process.

I also noticed that the image files in the is-cache folder are different than the original file names which makes it hard to match them against an actual Media Gallery image. So, where ImageSharp keeps a relation between the original file and the resized one? Does it require to rebuild and resize images everytime we restart the app pool? If so, then I think this is quite wrong.

deanmarcussen commented 5 years ago

I just tried this locally with a modified Agency theme - took out all the image resizing from the main template and deleted the is-cache folder. In that case it didn't try and resize them.

Tried again with my other site which uses the razor tag helper (and no resizing) and the unresized media all passed through the image resizer and ended up in the is-cache folder.

So going to have to debug that later and see what the differences are between using the fluid helper, and the razor helper - can't see how its helper related as it should be url based. So hopefully user error by me, and is some weirdness I introduced on my razor site.

Yes the naming is difficult to follow - I only noticed it initially because windows was showing thumbnails, and I was pretty sure that the images on my homepage should probably be resized as one of them is way bigger than it needs to be, but I didn't think I'd ever got around to including that in the template.

From reading the IS code it looks like it uses a hash of the url path (presumably including query string) as it's cache key, and then splits them down into the a/b/d/e/x/etc.

I've seen the rebuild of the cache occur when I deploy - no idea if it's app pool recycle or what that triggers it - I wrote a sitewarmer that ships the contents of the website of to an azure function every time I deploy, and then the azure function parses each page for media tags and hits them, to force them into the cache. But with a cdn in front of it that probably becomes unnecessary, as once they're in the cdn, they're there forever, or until it's purged.

Skrypt commented 5 years ago

I stopped re-downloading my is-cache folder yesterday. Through FTP it took around 6 hours and it was not finished. I had around 6 gb of files already and the media folder is around 8 gb. So, it's a bad sing that it must be adding original files in there too. Also found that my is-cache folder is making Visual Studio 2019 making a stack overflow exception because it tries to analyse each folders in that cache when I open the solution.

deanmarcussen commented 5 years ago

Ok, I checked again now with a modified agency theme. Definitely all media being served is being moved to the is-cache folder (didn't clear my browser cache this morning, clearly too early).

It must be running through the IS processor as well because the file sizes are different - an unresized jpg - the mountain picture from the agency theme about/4.jpg - is 4902 bytes in it's source media folder and when it becomes a204cf0a6ba1.jpg in the is-cache folder has risen to 5437 bytes.

Think it's happening because anything in the MediaFileStore gets picked up on by IS and 'handled' by the IS middleware. So regular StaticFile serving never gets an opportunity to handle it, even if it doesn't need resizing. There might be something we could do about that to handle it earlier. Not sure.

Regardless the is-cache folder clearly can grow too large!

Skrypt commented 5 years ago

I'm getting a lot of logs on Azure with the MediaFileProvider. It's probably related. The thing here is that on Azure the path for these files are different than E:\Repositories\ .... I need to find from where it takes this path.

2019-03-11 00:00:47.0798|Default|8000035c-0000-da00-b63f-84710c7967bb||Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware|ERROR|An unhandled exception has occurred while executing the request. System.ArgumentOutOfRangeException: The specified URL is not inside the URL scope of the file store.
Parameter name: publicUrl
   at OrchardCore.Media.Services.MediaFileStore.MapPublicUrlToPath(String publicUrl) in E:\Repositories\AffairesExtra - OrchardCore\src\OrchardCore.Modules\OrchardCore.Media\Services\MediaFileStore.cs:line 89
   at OrchardCore.Media.Processing.MediaFileProvider.GetAsync(HttpContext context) in E:\Repositories\AffairesExtra - OrchardCore\src\OrchardCore.Modules\OrchardCore.Media\Processing\MediaFileProvider.cs:line 49
   at SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Localization.RequestLocalizationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Builder.Extensions.MapWhenMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.Invoke(HttpContext context)    at OrchardCore.Media.Services.MediaFileStore.MapPublicUrlToPath(String publicUrl) in E:\Repositories\AffairesExtra - OrchardCore\src\OrchardCore.Modules\OrchardCore.Media\Services\MediaFileStore.cs:line 89
   at OrchardCore.Media.Processing.MediaFileProvider.GetAsync(HttpContext context) in E:\Repositories\AffairesExtra - OrchardCore\src\OrchardCore.Modules\OrchardCore.Media\Processing\MediaFileProvider.cs:line 49
   at SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Localization.RequestLocalizationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Builder.Extensions.MapWhenMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.Invoke(HttpContext context)

And here on localhost I'm getting these error logs

2019-04-16 01:22:46.0254|Default|80000003-0002-fd00-b63f-84710c7967bb||SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware|ERROR|The image 'https://localhost:44300/media/mediafields/MotorEquipment/4cn67rj8kfe2d0zptspnkp6n5b/8b5cc36344b4da96e3a6911a313bf697.jpg?&width=480&height=360&rmode=crop' could not be resolved 
2019-04-16 01:22:46.1884|Default|80000049-0008-fe00-b63f-84710c7967bb||SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware|ERROR|The image 'https://localhost:44300/media/mediafields/MotorEquipment/4dkx25wd4bkpbtgpzk0tanenm1/63f79f1d6421260ae41a887202c3012f.jpg?&width=480&height=360&rmode=crop' could not be resolved 
2019-04-16 01:22:46.1884|Default|80000028-0009-ff00-b63f-84710c7967bb||SixLabors.ImageSharp.Web.Middleware.ImageSharpMiddleware|ERROR|The image 'https://localhost:44300/media/mediafields/MotorEquipment/4stp1hvrsw85bz3mbrtezq7qw6/508f4c3de1930198955f8068a4976ec4.jpg?&width=480&height=360&rmode=crop' could not be resolved

But the strange part is that everything works. I can see those files without an issue.

deanmarcussen commented 5 years ago

Interesting. Are you still serving media locally? Not through blob storage?

The E:\Repositories path just refers to where Azure built the dll I think from memory. Is it building with kudo on Azure directly? Not with VSTS?

Skrypt commented 5 years ago

I'm publishing the app on my PC and deploying it to Azure with FTP. Old school way. The E:/Repository is the path where the source code stands on my PC. I'm not using Azure Blob Storage for now since we need to fix the is-cache issue before.

deanmarcussen commented 5 years ago

Proper old school!

Front the media with a cdn and clear your is-cache folder regularly?

Skrypt commented 5 years ago

Actually resizing images takes a lot of processing on Azure and it slows down page loading time. What I need is to bypass this is-cache and store resized images on a CDN permanently and use that CDN natural cache simply.

deanmarcussen commented 5 years ago

3468 is working for me.

Once the cdn is parked in front of it and the pages have been hit once the is-cache folder can be deleted and it's not even created again - until new media is added.

Email me if you want to chat infrastructure some more - this has gotten a bit off topic

Skrypt commented 5 years ago

True, I'm going to try your PR first.

sebastienros commented 5 years ago

We have a placeholder file in the package to ensure the folder is created. It should have worked.

deanmarcussen commented 5 years ago

Can repro at will. The placeholder doesn't get created. Lesson for me -> always test before a demo!

I looked - the only reference I can find to a placeholder file is your commit https://github.com/OrchardCMS/OrchardCore/commit/e230fdbc7af021f9999a93c350dbc5707f5170ab which forces .placeholder in the OrchardCore.Cms.Web project, which doesn't have an impact when referencing OrchardCore.Application.Cms.Targets from an empty web project (not sure if the template has changed with VS 2019 or not - perhaps the dot net core template used to have a wwwroot in it.)

If I do create a wwwroot folder all works fine, but there is still no .placeholder file created in it.

Not sure where the .nuspec is but could we not just include a .placeholder in the contentfiles in the package?

jtkech commented 5 years ago

@deanmarcussen @sebastienros

We have a placeholder file in the package

There is no placeholder in the package, right now you need to create yourself a wwwroot folder under your application, as done in our OC.Cms.Web application, and with a placeholder so that when publishing a wwwroot folder is created.

We would need to create a custom msbuild script executed on publishing and that auto generate this placeholder and add it to the publish outputs.

sebastienros commented 5 years ago

Our template should have it then. And also I thought there was a better exception message when it's not here.

deanmarcussen commented 5 years ago

How about just putting a placeholder in the nuget package contentFiles?

jtkech commented 5 years ago

@sebastienros @deanmarcussen

Need to be retried but as i remember now package content files are considered immutable, so they can be referenced but not copied under a project. Hmm, maybe just create the wwwroot folder at runtime.

I tried to create the folder (if WebRootPath == null) at runtime in the media startup, with a lock because tenants can be built concurently, seems to work. But i think here it is a host concern and would be better, in a new UseOrchardCms() helper, to init the WebRootPath if null and then the WebRootFileProvider which in this case has been setup to a NullFileProvider, as we do for the ContentRootFileProvider in our UseOrchardCore() helper to add our embedded file provider.

That said, in our cms app startup just did the following, then the host automatically create the folder, then it works. I think this is the right way because the folder is created early by the host itself which then 1st setup correctly its WebRootFileProvider. Do we need an helper just for this?

        WebHost.CreateDefaultBuilder<Startup>(args)
            .UseWebRoot("wwwroot")
            .UseNLogWeb();
deanmarcussen commented 5 years ago

@sebastienros @jtkech Nice UseWebRoot() looks sensible. That fixes the template, but doesn't fix the Use empty project Guide, which was what I was following when I stumbled across this. I still think it's just a minor bad user experience when following our docs/guides, so the tweak to the documentation for people that are using the beta3 NuGet's seems worthwhile.

Going forward - Package content files are immutable, but a .placeholder file will never change, so it is immutable. I could look at that if I knew where the .nuspec file was generated!

However I still think we have a bigger question.

Should the host be holding tenant related is-cache files? I think they should be held by the tenant, in the tenants App_Data folder. This would allow us to offer fine-grained tenant control of the size of that folder / or number of items in it.

Having discovered that IS processes all images that are served by it's pipeline, not just resized images, and moves them into it's is-cache folder, I think having control of that folder becomes a tenant concern, not a host concern - or maybe it's a bit of both. host needs to be able to set limits for tenants, but tenants should be able to reduce that limit perhaps?

Given that we start IS.Middleware on a tenant by tenant basis, it seems the cache should be tenant by tenant. Propose we swap the IS default PhysicalFileSystemCache.cs for a TenantPhysicalFileSystemCache.cs and implement a OrchardCore.Tenants.MediaCacheProvider based on the OrchardCore.Tenants.FileProvider which we give to the TenantPhysicalFileSystemCache.cs instead of env.WebRootFileProvider

This would also solve creating the directory re: tenant setup race conditions (I think)

And I think removes any dependency on the host file system

I've got about half this written, but would rather have general thoughts, and be aware of any other issues it might cause before putting a PR in

jtkech commented 5 years ago

@deanmarcussen

That fixes the template, but doesn't fix the Use empty project Guide.

Yes, we would need to update the doc to add UseWebRoot("wwwroot"), but not sure if this is what you meant. Anyway, not sure myself that this is the better fix (see below), only a 1st suggestion.

Package content files are immutable, but a .placeholder file will never change.

You're right, what i just meant is that this file will not be physically injected in your project.

Having discovered that IS processes all images that are served by it's pipeline, not just resized images.

I think i fixed it in my PR through the MediaFileProvider.IsValidRequest() that checks if there is at least one of the mandatory resizing parameters (width or height).

Control of that folder becomes a tenant concern, not a host concern - or maybe it's a bit of both.

A tenant is a kind of an isolated app but not yet an isolated host. What i mean is that beyond ISharp an host component needs to be correctly initialized at the app level and then we can compose with it knowing that if a shared host component is mutated in a tenant context we need a lock.

E.g we let the host initialize the ContentRootFileProvider then in UseOrchardCore() we can setup a composite still using this provider but also our ModuleEmbeddedFileProvider. Here it is still at the host level and just because the application has choosed to call UseOrchardCore().

So, among ISharp, we still need a WebRootFileProvider correctly setup by the host and then we can compose with it. There are pending PRs which do this, e.g for the verisioning tag helper using the WebRootFileProvider on which we add our custom provider.

When there is no wwwroot, WebRootPath is null and the WebRootFileProvider is a null file provider. That why i thought about UseWebRoot("wwwroot"), otherwise if we create the wwwroot later at the app level, or in a tenant context with a lock, we need to also setup this file provider.

But as said above, not sure that my 1st suggestion to use UseWebRoot("wwwroot") is the better fix. When we let the host creating the wwwroot it doesn't put a placeholder, so when publishing no wwwroot will be outputed. So, in a published context it may be recreated and it may depend on access rights.

So, to do it ourself at runtime, at the app level we would need a UseOrchardCms() helper, or at the tenant level by the 1st tenant being built and with a lock, but i don't like it.

So i think i will create a little msbuild task embedded in the cms package to do it.

Propose we swap the IS default PhysicalFileSystemCache.cs for a TenantPhysicalFileSystemCache.cs.

Otherwise, i agree with all your suggestions about ISharp. This one looks like a very good idea!

deanmarcussen commented 5 years ago

@jtkech thanks for the extensive reply

Yes, we would need to update the doc to add UseWebRoot("wwwroot"), but not sure if this is what you meant. Anyway, not sure myself that this is the better fix (see below), only a 1st suggestion.

Yes that's what I meant - see #3459 - not to add UseWebRoot("wwwroot"), but just to add the folder in VS - also not sure is the best approach.

Package content files are immutable, but a .placeholder file will never change.

You're right, what i just meant is that this file will not be physically injected in your project.

I thought that was the point of contentFiles? That NuGet would add them to your project. I can think of a couple of times where I've use it physically add them to the csproj file.

Having discovered that IS processes all images that are served by it's pipeline, not just resized images.

I think i fixed it in my PR through the MediaFileProvider.IsValidRequest() that checks if there is at least one of the mandatory resizing parameters (width or height).

Brilliant! @Skrypt Will be pleased too I imagine.

Control of that folder becomes a tenant concern, not a host concern - or maybe it's a bit of both.

Also thanks, that helped me understand the tenant / host process a lot better, and yes, we do really want a WebRootFileProvider to aggregate them all together.

So i think i will create a little msbuild task embedded in the cms package to do it.

My only question here is does this force the WebRoot to wwwroot if we're forcing creation during an MSBuild Task. If user has decided to go off the standard and make their own myspecialroot do we break things by creating another wwwroot alongside it?

Otherwise, i agree with all your suggestions about ISharp. This one looks like a very good idea!

Cheers it's coming along, should get some time on it in the next few days.

jtkech commented 5 years ago

@deanmarcussen

I thought that was the point of contentFiles? That NuGet would add them to your project

This is no more the case with the new .csproj system and PackageReference, maybe unless for some project types. At some point, before embedding view contents in assembly, we needed an msbuild task to copy them from their package locations to be compiled by the precompilation tool before publishing.

My only question here is does this force the WebRoot to wwwroot if we're forcing creation during an MSBuild Task. If user has decided to go off the standard and make their own myspecialroot do we break things by creating another wwwroot alongside it?

This is exactly what i said in one of my last comments in #3490. In fact each module has a wwwroot folder and at the tenant level we call UseStatiFiles() by passing a custom file provider with an hard coded wwwroot. Do you know that when using OC the application is a module itself so you can request an app level static file in a 2nd way by using in the requested path the application name (not often used).

That said, i don't think that using UseWebRoot("myspecialroot") is a problem, the tenant level file provider will still use the hard coded wwwroot to serve module static files, including the app's module by using the app name in the requested path.

Just tried the msbuild task embedded in the cms package, it works with 2 lines of code. Normally we would need to mark this file to be published, but fortunately the standard publishing script automatically publish the wwwrot folder.

deanmarcussen commented 5 years ago

@jtkech

This is no more the case with the new .csproj system and PackageReference, maybe unless for some project types. At some point, before embedding view contents in assembly, we needed an msbuild task to copy them from their package locations to be compiled by the precompilation tool before publishing.

I did not know that. Cheers.

the tenant level file provider will still use the hard coded wwwroot to serve module static files, including the app's module by using the app name in the requested path.

And I didn't know that either, which is something I was wondering how to do the other day, so again, cheers.

JoshTango commented 1 year ago

I just got this issue with a new install and using the Agency theme.