Saved images are always re-compressed #36

chrisallen commented 12 years ago

As seen here:

Every fanart, poster, etc. that is scraped is always recompressed. The saved file is significantly smaller than the source and has noticeable artifacts.

I've located the problem: it's in "clsAPIImages.vb" (EmberAPI). The function that creates the new JPG is Public Sub Save(ByVal sPath As String, Optional ByVal iQuality As Long = 0). The quality of the cached images can be altered via iQuality, but the function still always recompresses the image. It can be fixed by rewriting the save routines: to scrape all fanart / posters, the function DownloadFile(downloadLink, savePath) can be used to download them, and FileCopy(SourceFile, DestinationFile) to copy the selected files from the cache to the movie folder. That way nothing is recompressed.
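A minimal sketch of that download-then-copy idea, assuming the cache lives on disk. The module and method names (RawImageCopy, CacheImage, CopySelected) are hypothetical and not part of Ember; WebClient.DownloadFile and File.Copy stand in for the DownloadFile / FileCopy calls mentioned above.

```vbnet
Imports System.IO
Imports System.Net

Module RawImageCopy
    ' Hypothetical helper: write the downloaded bytes straight to the cache file.
    ' The image is never decoded, so nothing is re-encoded.
    Sub CacheImage(ByVal downloadLink As String, ByVal cachePath As String)
        Using client As New WebClient()
            client.DownloadFile(downloadLink, cachePath)
        End Using
    End Sub

    ' Hypothetical helper: copy the selected cache file byte-for-byte
    ' into the movie folder, overwriting an existing file.
    Sub CopySelected(ByVal cachePath As String, ByVal destinationPath As String)
        File.Copy(cachePath, destinationPath, True)
    End Sub
End Module
```

Because both steps only move bytes around, the file in the movie folder is identical to what the server delivered.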

msavazzi commented 11 years ago

Good idea! But it is not trivial: if you read the code, the images are stored in memory, not in files, and they are decoded ("uncompressed") into memory on download. The procedure would have to be changed to download and save the images, store the filenames, load each image from its file and then, once an image is selected, copy that file and delete the others.

Quite a change.
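For illustration, the final "copy the selected file and delete the others" step could look roughly like this. CachedSelection and KeepSelected are hypothetical names, and the sketch assumes the cache file paths were collected in a list during download.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module CachedSelection
    ' Copies the selected cache file to its destination unchanged,
    ' then removes every cache file so no garbage is left behind.
    Sub KeepSelected(ByVal cachedFiles As IList(Of String), ByVal selectedIndex As Integer, ByVal destinationPath As String)
        File.Copy(cachedFiles(selectedIndex), destinationPath, True)
        For Each cachedFile As String In cachedFiles
            If File.Exists(cachedFile) Then File.Delete(cachedFile)
        Next
    End Sub
End Module
```

One caveat with this approach: an Image loaded with Image.FromFile keeps its file locked until the Image is disposed, so the previews would have to be disposed before the cleanup runs.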

The other option is to download twice: the first time to show the images and then, when an image is selected, download it again to disk instead of saving the one held in memory.

I think the chosen option is the best from a performance perspective AND never leaves garbage around.

An additional point is that the current procedure does an implicit conversion from any format supported by the Image object (PNG, BMP, GIF, etc.) to JPG.
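As a rough sketch of how that implicit conversion could be detected: Image.RawFormat reports the format the image was decoded from, so it can be compared against ImageFormat.Jpeg before saving. FormatCheck and IsJpeg are hypothetical names, not Ember code.

```vbnet
Imports System.Drawing
Imports System.Drawing.Imaging

Module FormatCheck
    ' Returns True when the image was decoded from JPEG data.
    ' For a PNG, BMP or GIF source this returns False, which means a
    ' JPEG save would convert the format, not just recompress it.
    Function IsJpeg(ByVal img As Image) As Boolean
        Return img.RawFormat.Equals(ImageFormat.Jpeg)
    End Function
End Module
```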

Have you tried to raise the quality?

chrisallen commented 11 years ago

I have tried raising the quality in the GUI, but no matter what is selected, it still recompresses the JPGs and introduces artifacts. I suggest your second option, whereby it downloads the image twice: once to preview, and then again to save it to disk if selected.

Cocotus commented 11 years ago

Good ideas, guys. I also don't like the fact that Ember recompresses before saving the file, so I added a few modifications to download the image again when I have selected an image and click the OK button. I will test it a bit and tell you how it works. It's possible that it needs some more modification. So far, the modified code:

In Public Class ScrapeImages, edit the Save method:

Public Shared Sub Save(ByVal _image As Image, ByVal sPath As String, Optional ByVal iQuality As Long = 0, Optional ByVal sUrl As String = "")
    Try
        If IsNothing(_image) Then Exit Sub

        Dim doesExist As Boolean = File.Exists(sPath)
        Dim fAtt As New FileAttributes
        If Not String.IsNullOrEmpty(sPath) AndAlso (Not doesExist OrElse (Not CBool(File.GetAttributes(sPath) And FileAttributes.ReadOnly))) Then
            If doesExist Then
                'get the current attributes to set them back after writing
                fAtt = File.GetAttributes(sPath)
                'set attributes to none for writing
                File.SetAttributes(sPath, FileAttributes.Normal)

                Try
                    If Not sUrl = "" Then
                        Dim stroriginalurl As String = sUrl

                        'Image Download from tmdb is special, need original size
                        If Not sUrl.Contains("impawards") AndAlso Not sUrl.Contains("movieposterdb") Then
                            'Always get original image...
                            'links to images (tmdb) have following structure:  'example:
                            Dim temp As String = sUrl
                            Dim stringArray() As String = Split(temp, "/")
                            'index 5 is only valid when the array has at least 6 elements
                            If stringArray.Length > 5 Then
                                ' stringArray(5) contains values like "w185","original", "w154"...-->size -> we want original!
                                stroriginalurl = sUrl.Replace(stringArray(5), "original")
                            End If
                        End If
                        Dim webclient As New Net.WebClient
                        webclient.DownloadFile(stroriginalurl, sPath)
                        If doesExist Then File.SetAttributes(sPath, fAtt)
                        Exit Sub
                    End If
                Catch ex As Exception
                    Master.eLog.WriteToErrorLog(ex.Message, ex.StackTrace, "Error")
                End Try

            End If

            Using msSave As New MemoryStream
                Dim retSave() As Byte
                Dim ICI As ImageCodecInfo = GetEncoderInfo(ImageFormat.Jpeg)
                Dim EncPars As EncoderParameters = New EncoderParameters(If(iQuality > 0, 2, 1))

                EncPars.Param(0) = New EncoderParameter(Encoder.RenderMethod, EncoderValue.RenderNonProgressive)

                If iQuality > 0 Then
                    EncPars.Param(1) = New EncoderParameter(Encoder.Quality, iQuality)
                End If

                _image.Save(msSave, ICI, EncPars)

                retSave = msSave.ToArray

                Using fs As New FileStream(sPath, FileMode.Create, FileAccess.Write)
                    fs.Write(retSave, 0, retSave.Length)
                End Using
            End Using

            If doesExist Then File.SetAttributes(sPath, fAtt)
        End If
    Catch ex As Exception
        Master.eLog.WriteToErrorLog(ex.Message, ex.StackTrace, "Error")
    End Try
End Sub

In Public Class Images, edit the Save method:

Public Sub Save(ByVal sPath As String, Optional ByVal iQuality As Long = 0, Optional ByVal sUrl As String = "")
    Try
        If IsNothing(_image) Then Exit Sub

        Dim doesExist As Boolean = File.Exists(sPath)
        Dim fAtt As New FileAttributes
        Dim fAttWritable As Boolean = True

        If Not String.IsNullOrEmpty(sPath) AndAlso (Not doesExist OrElse (Not CBool(File.GetAttributes(sPath) And FileAttributes.ReadOnly))) Then
            If doesExist Then
                'get the current attributes to set them back after writing
                fAtt = File.GetAttributes(sPath)
                'set attributes to none for writing
                Try
                    File.SetAttributes(sPath, FileAttributes.Normal)
                Catch ex As Exception
                    fAttWritable = False
                End Try
            End If

            Try
                If Not sUrl = "" Then
                    'TODO V3 API implementation to get ALL posters!
                    '  GetsImagesFromTMDBv3("URL/MOVIEDID")

                    Dim stroriginalurl As String = sUrl
                    'Image Download from tmdb is special, need original size
                    If Not sUrl.Contains("impawards") AndAlso Not sUrl.Contains("movieposterdb") Then
                        'Always get original image...
                        'links to images (tmdb) have following structure:  'example:
                        Dim temp As String = sUrl
                        Dim stringArray() As String = Split(temp, "/")
                        'index 5 is only valid when the array has at least 6 elements
                        If stringArray.Length > 5 Then
                            ' stringArray(5) contains values like "w185","original", "w154"...-->size -> we want original!
                            stroriginalurl = sUrl.Replace(stringArray(5), "original")
                        End If
                    End If

                    Dim webclient As New Net.WebClient
                    'Download image!
                    webclient.DownloadFile(stroriginalurl, sPath)

                    If doesExist And fAttWritable Then File.SetAttributes(sPath, fAtt)
                    Exit Sub

                End If
            Catch ex As Exception
                Master.eLog.WriteToErrorLog(ex.Message, ex.StackTrace, "Error")
            End Try

            Using msSave As New MemoryStream
                Dim retSave() As Byte
                Dim ICI As ImageCodecInfo = GetEncoderInfo(ImageFormat.Jpeg)
                Dim EncPars As EncoderParameters = New EncoderParameters(If(iQuality > 0, 2, 1))

                EncPars.Param(0) = New EncoderParameter(Encoder.RenderMethod, EncoderValue.RenderNonProgressive)

                If iQuality > 0 Then
                    EncPars.Param(1) = New EncoderParameter(Encoder.Quality, iQuality)
                End If

                _image.Save(msSave, ICI, EncPars)

                retSave = msSave.ToArray

                'only write when the path is not longer than 260 characters
                If sPath.Length <= 260 Then
                    Using fs As New FileStream(sPath, FileMode.Create, FileAccess.Write)
                        fs.Write(retSave, 0, retSave.Length)
                    End Using
                End If
            End Using

            If doesExist And fAttWritable Then File.SetAttributes(sPath, fAtt)
        End If
    Catch ex As Exception
        Master.eLog.WriteToErrorLog(ex.Message, ex.StackTrace, "Error")
    End Try
End Sub         
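The URL rewrite used in both Save methods can be isolated into a small function. The sketch below is an assumption about how that could be done, not Ember code: TmdbUrl and ToOriginalSizeUrl are hypothetical names, and the example.com URL only imitates the segment layout described in the code comments. It assigns the size segment by index, so only that one segment changes even if the same text appears elsewhere in the URL.

```vbnet
Module TmdbUrl
    ' Replaces the size segment (index 5 after splitting on "/",
    ' e.g. "w185" or "w154") with "original".
    ' URLs with fewer than 6 segments are returned unchanged.
    Function ToOriginalSizeUrl(ByVal sUrl As String) As String
        Dim parts() As String = sUrl.Split("/"c)
        If parts.Length > 5 Then
            parts(5) = "original"
            Return String.Join("/", parts)
        End If
        Return sUrl
    End Function
End Module
```

For example, ToOriginalSizeUrl("http://example.com/t/p/w185/poster.jpg") returns "http://example.com/t/p/original/poster.jpg".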

In Public Class dlgImgSelect:

Add to the declarations: Private selURL As String = ""

Edit Private Sub DoSelect and add under Me.selIndex = iIndex: Me.selURL = poster.URL

Edit Private Sub OK_Button_Click and change both Save calls in this Sub to: Me.tmpImage.Save(tmpPathPlus, 100, selURL)

Cocotus commented 11 years ago

Another problem is that someone should definitely rewrite the TMDB API calls in Ember. It uses a deprecated API which sometimes delivers wrong results! (Not all possible posters are displayed in Ember, which is definitely not cool.)

I asked here:

Cocotus commented 11 years ago

Great, msavazzi! BTW, the non-recompressing poster and fanart saving I coded (code above) seems to work: so far no problems while scraping at least 30 movies.

Oh, and I just found an explanation for the sporadic crashes of Ember. Stack trace of the crash:

    at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)
    at generic.EmberCore.WebServer.clsServer.HandleConnection() in D:\Dropbox\BACKUP\Modding\Programmierung\bodrick-Ember-MM-be61071\cocotus-Ember-MM\Addons\generic.EmberCore.WebServer\Server\clsServer.vb:line 71
    at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
    at System.Threading.ExecutionContext.runTryCode(Object userData)
    at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
    at System.Threading.ThreadHelper.ThreadStart()

It's this code in Addons\generic.EmberCore.WebServer\Server\clsServer.vb, line 71:

    iStartPos = sRequest.LastIndexOf("/") + 1
    sRequestedFile = sRequest.Substring(iStartPos)

This code will crash Ember if the "/" is not found (LastIndexOf returns -1)... it happened to me twice!

So I replaced it with:

    If (iStartPos > 0) Then
        sRequestedFile = sRequest.Substring(iStartPos)
    Else
        sRequestedFile = sRequest
    End If

So far no problems anymore :)

Cocotus commented 11 years ago

Hey, no problem. BTW, in an earlier post I gave this as the fix:

    If (iStartPos > 0) Then
        sRequestedFile = sRequest.Substring(iStartPos)
    Else
        sRequestedFile = sRequest
    End If

But I think it's better to leave the function right away, because carrying on with a "wrong" sRequestedFile seems to produce other problems. So I modified it to:

    If (iStartPos > 0) Then
        sRequestedFile = sRequest.Substring(iStartPos)
    Else
        mySocket.Close()
        Return
    End If
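The same guard can be written as a small function that signals "nothing usable" to its caller, which is then free to close the socket and return. This is only a sketch under that assumption: RequestParsing and GetRequestedFile are hypothetical names and not part of clsServer.vb.

```vbnet
Module RequestParsing
    ' Returns the text after the last "/" in the request.
    ' Returns Nothing when the request is empty or contains no "/",
    ' so the caller can bail out instead of continuing with a bad value.
    Function GetRequestedFile(ByVal sRequest As String) As String
        If String.IsNullOrEmpty(sRequest) Then Return Nothing
        Dim iStartPos As Integer = sRequest.LastIndexOf("/"c) + 1
        If iStartPos > 0 Then Return sRequest.Substring(iStartPos)
        Return Nothing
    End Function
End Module
```

A caller would then check the result with "Is Nothing" before using it, and close the connection in that case.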


Cocotus commented 11 years ago

"I've added your changes but I think they are partial, as SaveFanart, SavePoster and the other saves are still using the quality. Can you check what happens with different options?"

Hmm, do you mean the methods SaveAsFanart and SaveAsPoster in clsAPIImages? I think they all use the Save method which I edited, or am I missing something?

EDIT: BTW, I'm not using Ember with series, so I can't check whether that's the problem :)

Cocotus commented 11 years ago

I have rescraped about 50 movies over the last few days and I will check/compare whether there are differences. Thanks for the hints. Also, good job working on the new scraper; I think it's definitely a good idea :)