umbraco-community / UmbracoFileSystemProviders.Azure

:cloud: An Azure Blob Storage IFileSystem provider for Umbraco

Can this be used to offload existing media to azure? #12

Closed akeilox closed 8 years ago

akeilox commented 9 years ago

I haven't used Azure, but I'm exploring the viability of taking an existing Umbraco install and moving it to Azure.

The first question that comes to mind is whether I can start by moving the media section (the existing files) to Azure, and what would be required to update the TinyMCE content references, i.e. whether that is a viable option or not.

Would be great to hear your thoughts on this.

AussieInSeattle commented 9 years ago

I'd also like to know the steps for moving an existing site. Good work James, hope oz is treating you well.

JimBobSquarePants commented 9 years ago

Hi @akeilox

I think we could probably write something for the installer that would allow moving media files. Umbraco should be storing the path in the db as relative, so theoretically we could grab those values and bulk upload them to Azure. It could take a long time though, depending on the size of your current media section.
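A minimal sketch of the first half of that idea: walking a local copy of the media folder and working out the blob name each file would get. It is a standalone Python illustration rather than installer code. The function names `blob_name_for` and `plan_uploads` are hypothetical, and the container layout (`<id>/<file>`, relative to the media root) is an assumption. The actual upload, via a storage tool or an SDK, is deliberately left out.

```python
import os
import tempfile


def blob_name_for(local_path, media_root):
    """Return the blob name for a file: its path relative to media_root,
    with forward slashes and the original casing preserved."""
    relative = os.path.relpath(local_path, media_root)
    return relative.replace(os.sep, "/")


def plan_uploads(media_root):
    """Return (local_path, blob_name) pairs for every file under media_root,
    sorted by blob name."""
    plan = []
    for folder, _dirs, files in os.walk(media_root):
        for name in files:
            local_path = os.path.join(folder, name)
            plan.append((local_path, blob_name_for(local_path, media_root)))
    return sorted(plan, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Build a throwaway media folder to demonstrate the mapping.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "1245"))
        with open(os.path.join(root, "1245", "Pen.jpg"), "wb") as handle:
            handle.write(b"fake image bytes")
        for _local_path, blob_name in plan_uploads(root):
            print(blob_name)
```

Keeping the casing untouched here matters because blob names are case sensitive, so the name uploaded has to match the URL that Umbraco later requests.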

Cheers @AussieInSeattle, yeah it's great here. Nice and summery!

@Jeavon What are your thoughts on this? Doable?

AussieInSeattle commented 9 years ago

A documented manual approach is all I was after. If you have a large amount of media, I would think the installer might time out or provide a bad UX as it chugs along without giving any feedback.

Seems the documented approach is simply the following for a live site:

Is that right?

Regards, Matt

JimBobSquarePants commented 9 years ago

Yeah, that looks like the correct approach to me.

You're probably correct - Automation could lead to trouble

Jeavon commented 9 years ago

I have thought about importing all existing files from the media folder to storage in the installer, but the UI would be pretty complex. @AussieInSeattle's process is the correct approach to transferring existing media.

akeilox commented 9 years ago

For someone new to Azure, could you please provide some guidance on how to go about step 2 of the documented approach:

I have created an Azure account but am not sure how to go about this step. I would greatly appreciate some guidance.

paulbrown79 commented 9 years ago

@akeilox you can connect to your Azure storage account using some 3rd party tools. I use http://azurestorageexplorer.codeplex.com/. After you take a copy of your Umbraco media to your local machine, you can then use this tool to upload it to your storage account.

AussieInSeattle commented 9 years ago

@Jeavon Just watched you do this on uHangout #74 - one question though - at 12:15 into the uHangout you've pretty much done the above steps, but when you go to the media folder in /umbraco/ there are no images? Is that because all of the images were upload style images where they go into the media folder on disk but you can't access them via the media section of /umbraco/ ?

Jeavon commented 9 years ago

@AussieInSeattle that's just because the Fanoe starter kit doesn't create media items, it should. There is a bug for it here http://issues.umbraco.org/issueMobile/U4-5957

akeilox commented 9 years ago

@paulbrown79 thanks for the heads up! @Jeavon I watched the video and it was very helpful. Using CloudXplorer, I uploaded a 5GB /media folder from an existing install.

Most images worked fine, until I realized some of the images were not loading, e.g. /media/1245/pen.jpg?width=100

In CloudXplorer it shows as /media/1245/Pen.jpg and on the file system as /media/1234/Pen.jpg

Umbraco (7.1.8, as we haven't moved to the grid yet) is for some reason asking for /media/1245/pen.jpg

Do you guys have any clues on where to look for the issue/fix? For reference, I uploaded the 0.5 package released today. It gave a warning to update ImageProcessor and re-install, so I installed the latest ImageProcessor and re-installed the package. After that was done, I followed the procedure above.

[update]: it seems most of the images are from an Archetype, which contains 2 fields: an Umbraco.MultipleMediaPicker (multi-select not enabled, for a single image only) and a TextString property.

Here is the code where it fetches the image from the Archetype:

@{
            var theUrl="";
            var theTarget="";
            var multiUrlPicker =  fieldset.GetValue("link");
            if (multiUrlPicker.Any()){
                foreach (var item in multiUrlPicker){
                    theUrl=item.Url;
                    theTarget=item.Target;
                }
             }
}

Jeavon commented 9 years ago

@akeilox blobs are case sensitive, have you tried /media/1245/Pen.jpg?width=100 (capital P)?

akeilox commented 9 years ago

@Jeavon yes, /media/1245/Pen.jpg is there, and in the Umbraco folder it's indeed /media/1245/Pen.jpg

For some reason, when getting the URL from Umbraco.MultipleMediaPicker it returns /pen.jpg

[edited] Please ignore the code above. The following is the code used to fetch the image:

if (multiUrlPicker.Any())
                              {
                                  foreach (var item in multiUrlPicker){   
                                <li>
                                 <a href="@item.Url" target="@item.Target">
                                 <img src="@(Umbraco.TypedMedia(fieldset.GetValue("image")).Url)?width=100&amp;height=100&amp;mode=crop" width="100%" alt=""/>
                                </a>
                                 </li>}
                              }

Jeavon commented 9 years ago

Strange, Archetype seems to be dropping the casing somewhere. It's probably wise to look up the Url from the media item?

Something like this:

@{
            var theUrl="";
            var theTarget="";
            var multiUrlPicker =  fieldset.GetValue("link");
            if (multiUrlPicker.Any()){
                foreach (var item in multiUrlPicker){                                               
                    var mediaItem = Umbraco.TypedMedia(item.id);
                    theUrl = mediaItem.Url;
                    theTarget=item.Target;
                }
             }
}

akeilox commented 9 years ago

Strange indeed. This was the code that was getting the image list:

@foreach (var fieldset in Model.Content.GetPropertyValue<ArchetypeModel>("slideshow"))
                            {   
                              var multiUrlPicker =  fieldset.GetValue<MultiUrls>("link");   

                            if (multiUrlPicker.Any())
                              {

                                  foreach (var item in multiUrlPicker){   
                                <li>
                                 <a href="@item.Url" target="@item.Target">
                                 <img src="@(Umbraco.TypedMedia(fieldset.GetValue("image")).Url)?width=100&amp;height=100&amp;mode=crop" width="100%" alt=""/>
                                </a>
                                 </li>}
                              }

I changed it to this, and it is still dropping the casing:

@foreach (var fieldset in Model.Content.GetPropertyValue<ArchetypeModel>("slideshow"))
                            {       
                                var theimageUrl="";

                                var multiUrlPicker =  fieldset.GetValue<MultiUrls>("link");
                                var theImage  =       fieldset.GetValue("image");
                                if (multiUrlPicker.Any()){
                                    foreach (var item in multiUrlPicker){                                               
                                        var mediaItem = Umbraco.TypedMedia(theImage);
                                        theimageUrl = mediaItem.Url;                      

                                <li>
                                 <a href="@item.Url" target="@item.Target">
                                 <img src="@(theimageUrl)?width=100&amp;height=100&amp;mode=crop" width="100%" alt=""/>
                                </a>
                                 </li>

                                    }
                                 }

Jeavon commented 9 years ago

@akeilox if you go to the media item in the Umbraco backoffice and click on the image to open it in a new tab, does it lose the casing?

akeilox commented 9 years ago

Here is another case of dropped casing, where Eye.jpg is returned as eye.jpg; it's not using Archetype here:

var allNodesWithTags = CurrentPage.AncestorOrSelf().DescendantsOrSelf().Where("tags != \"\"");
foreach (var node in allNodesWithTags.OrderBy("date desc, UpdateDate desc"))
    {
    <img class="hidden-phone" src="@(Umbraco.Media(node.image).Url)?width=240&height=180&mode=max" />
}

akeilox commented 9 years ago

@Jeavon if I go to Media and search for Eye, the search result shows as "eye". I click on the media item and the title reads "Eye", and when I click on it to open it in a new tab it's intact, not losing the casing.

So is the search result returning it as "eye" an Examine thing? Perhaps we are on to something.

akeilox commented 9 years ago

I rebuilt both the Internal and External Examine indexes and it still returns "eye" when I search for "Eye" in the Media section.

I am not sure where the media is indexed from. In all of the instances above, if I locate the image and click on it, the title is OK, and when I click the image it opens fine. No case dropping in the media view page.

It's only when I call Umbraco.Media(node.image).Url from code that it returns eye.jpg

Jeavon commented 9 years ago

Really very strange, I need to test and see if I can replicate (what version of Umbraco are you using?). It might be wise for us to lowercase all blob names to avoid the issue. @JimBobSquarePants any thoughts on that?

akeilox commented 9 years ago

@Jeavon using 7.1.8

In the Media section, a search for "Eye" returns the result as "eye". Would you know where it fetches the results from? Clearly there is some sort of correlation: the result comes back as "eye", not "Eye", when the media name is "Eye" and the filename is "Eye.jpg".

Rebuilding the Examine indexes didn't change this behaviour.

Jeavon commented 9 years ago

@akeilox how about trying the umbracoFile property?

@{
    var allNodesWithTags = CurrentPage.AncestorOrSelf().DescendantsOrSelf().Where("tags != \"\"");

foreach (var node in allNodesWithTags.OrderBy("date desc, UpdateDate desc"))
    {
        <img class="hidden-phone" src="@(Umbraco.Media(node.image).umbracoFile)?width=240&height=180&mode=max" />
    }
}

akeilox commented 9 years ago

@Jeavon I just tried that and it's giving the same eye.jpg

In the Media section the media title is "Eye", the image shows, and clicking on it shows "Eye.jpg".
Searching for Eye shows the result as "eye" <== I don't know where this index is coming from

Umbraco.Media(node.image).umbracoFile
Umbraco.Media(node.image).Url

both return

eye.jpg

akeilox commented 9 years ago

@JimBobSquarePants @Jeavon

Do you have any further insight into what might be causing the case of the first character of the media name to be dropped? Specifically, when a user searches in the Media section for "Eye" the result shows "eye", but the title of the image and the file name are both Eye.jpg.

This seems to be consistent for the first character of all media files. I have tried rebuilding all indexes but can't figure out where the media search gets the result with the dropped case on the first letter.

Would be great to hear your thoughts on this.

AussieInSeattle commented 9 years ago

I believe the Examine Internal Index converts everything to lowercase as part of the analyzer being used doesn't it? If you name a file EyE.jpg - does it return eyE.jpg or eye.jpg? That would confirm it.

JimBobSquarePants commented 9 years ago

Hi there,

There's definitely nothing in this codebase that is changing the casing on save so it's either Umbraco doing it somewhere or something else third-party. In all my testing I've never seen this to be an issue when getting the crop url but I don't use Archetype, however I can't find any case changing code within the archetype code base.

akeilox commented 9 years ago

@AussieInSeattle I'm no Examine expert, but I have rebuilt all indexes and it doesn't change the behaviour. Is there more to it besides rebuilding Examine, like removing temp/cache files?

akeilox commented 9 years ago

@JimBobSquarePants looks like an Umbraco thing.

It's definitely not tied to Archetype, as Umbraco returns the URL with the first character's case dropped in the following code:

@{
    var allNodesWithTags = CurrentPage.AncestorOrSelf().DescendantsOrSelf().Where("tags != \"\"");
foreach (var node in allNodesWithTags.OrderBy("date desc, UpdateDate desc"))
    {
        <img class="hidden-phone" src="@(Umbraco.Media(node.image).umbracoFile)?width=240&height=180&mode=max" />
    }
}

Curiously, this is the case for all images where the first letter is a capital, like Eye.jpg and Pen.jpg. Within the Media section it's all good, but not when called from code.

Changing the URL getter to either of these returns the same result:
Umbraco.Media(node.image).umbracoFile
Umbraco.Media(node.image).Url

The src URL comes out as /media/1234/eye.jpg, while in the Media section it's Eye.jpg and, when clicked, it shows /media/1234/Eye.jpg

I am not sure where to look, having rebuilt all the Examine indexes and uploaded all media to Azure. The Media section works and uploading to Azure works, but the pages that fetch images as above are so far a mystery.

AussieInSeattle commented 9 years ago

Rebuilding the indexes won't change it from lowercase to "whatever you input case" - the type of analyzer used (I think) is what determines how the data is stored in Examine.

However, in the code you pasted above, I believe that hits the cache, not Examine.

Appears to be a bug in Umbraco with it changing it to lowercase - did you try naming your file EyE.jpg and see if it changes the second E to lowercase too?

And when you run the same code on a separate install without this package installed, do you get the same behavior? This should help determine where the error is.

Regards, Matt

akeilox commented 9 years ago

Well, this is strange... I created a new doctype with only a media picker and added @(Umbraco.Media(CurrentPage.file).umbracoFile)

to its template to show the bare URL, with nothing else in the template.

I uploaded a new media item named "ALLUPPERCASE.jpg"; it gets uploaded to Azure as "alluppercase.jpg" and shows in the URL (when clicking on the image) as /media/2333/alluppercase.jpg

I disabled 301 URL Tracker and ezSearch, deleted all indexes, deleted umbraco.config and tried again, and the same thing always occurs.

Where else can I look? (I'm on 7.1.8)

For the initial offload of media to Azure, I uploaded all media with CloudXplorer to the /media blob, so that was uploaded as is. The new ones uploaded with upper case all end up lowercase, it seems. It's getting more confusing, but I think the above test is as simple as it gets: a plain media picker and a bare call to get the media path.

Jeavon commented 9 years ago

@akeilox yes, I've tested with Umbraco v7.3.0 and it always saves the files with lowercase characters. I think this is good as it avoids any issues. However, I'm pretty sure it hasn't always been this way, so you may have a mixture. Has your site been upgraded from an older version?

akeilox commented 9 years ago

@Jeavon I started with 7.1 and upgraded to 7.1.8, and stayed there as we still depend on TinyMCE (no grid) and old Contour with tested code. We have been on 7.1.8 for quite some time. I installed the Azure file system provider on this system and all the tests mentioned above were carried out on this version.

So in 7.1.8 the Azure file system provider/Umbraco saves newly uploaded media in lowercase? Will using CloudXplorer to offload the files manually, then continuing to use the Media section as normal, recreate this issue for others?

As there is a lot of content and many images, captured in multi-url-picker, Archetype etc., I'm not sure what the best way to go from here is...

Would it be possible, say within the provider or in code dropped inside App_Code, to add logic so that when /media/ID/filename... is requested it checks the ID and tries to fetch the file with the requested filename case, and if that fails defaults to normal?
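The lookup being asked about can be sketched outside Umbraco as follows. This is a standalone Python illustration under stated assumptions, not something the provider does: `resolve_blob_name` is a hypothetical helper, the in-memory list stands in for a real container listing, and "fall back" is interpreted here as a case-insensitive match when the exact name is missing.

```python
def resolve_blob_name(requested, blob_names):
    """Return the stored blob name that matches `requested`.

    An exact (case-sensitive) match wins. Otherwise fall back to a
    case-insensitive match. Returns None when nothing matches at all.
    """
    if requested in blob_names:
        return requested
    by_lower = {}
    for name in blob_names:
        # Keep the first name seen for each lowercased key.
        by_lower.setdefault(name.lower(), name)
    return by_lower.get(requested.lower())


if __name__ == "__main__":
    stored = ["1245/Pen.jpg", "2333/alluppercase.jpg"]
    # The page asks for the lowercased URL; the blob was uploaded as Pen.jpg.
    print(resolve_blob_name("1245/pen.jpg", stored))
```

A lookup like this needs the list of names in the container, so in practice the listing would have to be cached rather than fetched on every image request.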

JimBobSquarePants commented 9 years ago

@akeilox No... The filesystem provider doesn't alter the casing, I've said that already:

"There's definitely nothing in this codebase that is changing the casing on save so it's either Umbraco doing it somewhere or something else third-party."

Umbraco does when it saves outside the interface. @Jeavon said this:

"I've tested with Umbraco v7.3.0 and it always saves the files with lowercase characters"

That change must have occurred somewhere between v7.1 and v7.1.8

The blob storage on Azure simply respects casing. If I were you I would write something that loops through your media and resaves it with the correct casing.

akeilox commented 9 years ago

@JimBobSquarePants thanks for the clarification.

I think I have boiled it down to the following test case, on another 7.1.8 install with no special plugins:

In a news listing page, each article gets its image src via @(Umbraco.Media(node.image).Url)?width=240

1) If I delete the Internal Examine index and refresh the listing page, the image link shows correctly.
2) If I go to the Developer -> Examine tab and click rebuild index (just to be sure), it starts re-indexing... Once the re-indexing is complete, I refresh the page and the image doesn't show (i.e. the URL case issue).

When this occurs I repeat point 1, refresh the listing page, and the images show again.

Is there any way to configure the InternalIndex to not index the media URL at least?

I'm not well versed in Umbraco internals or Examine, so I would be interested to hear your thoughts. Hopefully we can find the cause and document it for others accordingly.

akeilox commented 8 years ago

Just to give an update, and to share my findings for others encountering a similar issue:

I followed the first half of the uHangout video and used CloudXplorer to move the existing media to the Azure blob container media (copy and paste).

I traced the initial installation, which was 7.1.5, upgraded to 7.1.6 and finally 7.1.8. No plugins or event handlers were getting in the way, which ruled out the third-party plugin possibility.

The issue with Umbraco returning media URLs that contain capitals in lowercase seems to be due to Examine. Whenever I delete all Examine indexes and immediately refresh a page with media, all the media shows up correctly (casing respected). Once Examine has finished indexing, Umbraco returns the URL names in lowercase.

This was the case when I tried any of:
@(Umbraco.Media(node.image).Url)
@(Umbraco.Media(node.image).umbracoFile)
TypedMedia().Url

If I changed one of the pages' partials to read media with the old API, like

Media file = new Media(mediaId);
string url = file.getProperty("umbracoFile").Value.ToString();

then I get the correctly cased media URL, whether the Examine index is cleared, building or finished building.

Rewriting all the code to use the old API was not feasible in my case, so I took a copy of the original /media folder and ran a .bat script to iterate over every file and rename it to lowercase. Then I copied this (with overwrite) to the Azure blob container with CloudXplorer.
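For anyone wanting to reproduce the lowercasing step, here is a minimal sketch of the same idea. It is written in Python rather than as the .bat script actually used (which is not shown in this thread), and `lowercase_tree` is a hypothetical name. It renames file names only, not folders, and skips a file when its lowercase name is already taken by another file, so nothing gets overwritten. Run it against a copy of the media folder, not the original.

```python
import os
import tempfile


def lowercase_tree(root):
    """Rename every file under root to its lowercase name.

    Returns a list of (old_path, new_path) pairs for the files renamed.
    Files whose lowercase name already exists as a different file in the
    same folder are skipped.
    """
    renamed = []
    for folder, _dirs, files in os.walk(root):
        existing = set(files)
        for name in files:
            lowered = name.lower()
            if lowered == name:
                continue  # already lowercase
            if lowered in existing:
                continue  # would clobber another file on a case-sensitive disk
            old_path = os.path.join(folder, name)
            new_path = os.path.join(folder, lowered)
            os.rename(old_path, new_path)
            existing.discard(name)
            existing.add(lowered)
            renamed.append((old_path, new_path))
    return renamed


if __name__ == "__main__":
    # Demonstrate on a throwaway folder.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "1245"))
        with open(os.path.join(root, "1245", "Pen.jpg"), "wb") as handle:
            handle.write(b"fake image bytes")
        for old_path, new_path in lowercase_tree(root):
            print(os.path.basename(old_path), "->", os.path.basename(new_path))
```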

The files with upper-case names are still in the Azure blob container, but this let me move forward with UmbracoFileSystemProviders.Azure installed and running.

losolio commented 8 years ago

Thanks @AussieInSeattle for the instructions. I used StorageExplorer to move the media, and can confirm it worked like a charm.

JimBobSquarePants commented 8 years ago

I am going to go ahead and close this now. Thanks everyone for their input.