MihaiTheCoder / ocr-all-in-one

A repository that helps to easily test what OCR is best for your documents
MIT License

Can this be used with .NET Core (.NET 6/7)? #24

Open lahma0 opened 10 months ago

lahma0 commented 10 months ago

Is there any way to use this package in a project targeting .NET Core (or just .NET now)? When trying to create a new instance of WindowsOcrService in a project targeting .NET 7, without any other NuGet packages or references (other than 'Ocr.Wrapper'), I get the following exception:

System.IO.FileNotFoundException: 
'Could not load file or assembly 'System.Management.Automation, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'. The system cannot find the file specified.'

If I reference the NuGet package 'Microsoft.PowerShell.SDK' (v6.x.x or v7.x.x), I instead get this exception:

System.Management.Automation.RuntimeException: 
'Unable to find type [Windows.Storage.StorageFile,Windows.Storage, ContentType=WindowsRuntime].'

As far as I can tell, the problem is that the 'Get-Text-Win-OCR.ps1' script runs within PowerShell 7 (instead of PowerShell 5) when the project targets .NET Core (.NET 5/6/7). Is there any solution to this other than switching my project to target .NET Framework? Thanks!

lahma0 commented 10 months ago

I actually figured out a solution (after spending literally the entire day messing with this). I will post here how I fixed the issue in case anyone else comes along and wants to know the solution. Just follow these steps:

  1. First, you will likely get an exception about your app not being able to locate 'Resources\Get-Text-Win-OCR.ps1' because of the way Ocr.Wrapper tries to load this file. That's ok though because the 'Get-Text-Win-OCR.ps1' file included in Ocr.Wrapper will not work with .NET Core/.NET apps anyways. To fix this, create a directory named 'Resources' in your project directory (project directory, not solution directory) and then save the following content into a file named 'Get-Text-Win-OCR.ps1' in that directory:
    
    #initially taken from: https://github.com/HumanEquivalentUnit/PowerShell-Misc/blob/03cbe7deaa545a96e5102777877ac929c5b67a62/Get-Win10OcrTextFromImage.ps1
    using namespace Windows.Storage
    using namespace Windows.Graphics.Imaging
    $ErrorActionPreference = "Stop"
    $VerbosePreference = "SilentlyContinue"

    Add-Type -AssemblyName Microsoft.Windows.SDK.NET.dll
    Add-Type -AssemblyName WinRT.Runtime.dll

    # PowerShell doesn't have built-in support for Async operations,
    # but all the WinRT methods are Async.
    # This function wraps a way to call those methods, and wait for their results.
    $getAwaiterBaseMethod = [WindowsRuntimeSystemExtensions].GetMember('GetAwaiter').
        Where({ $PSItem.GetParameters()[0].ParameterType.Name -eq 'IAsyncOperation`1' }, 'First')[0]

    Function Await {
        param($AsyncTask, $ResultType)

        $getAwaiterBaseMethod.
            MakeGenericMethod($ResultType).
            Invoke($null, @($AsyncTask)).
            GetResult()
    }

    Function Show-SupportedLanguages {
        [CmdletBinding()]
        param($userLang)

        Write-Verbose "The language $userLang is not supported"
        Write-Verbose "Here is a list of possible languages for the OCR engine:"
        [Windows.Media.Ocr.OcrEngine]::AvailableRecognizerLanguages | % { Write-Verbose $_.LanguageTag }
    }

    $memo = @{}

    Function Get-Text-OCR {
        [CmdletBinding()]
        param($Path, [string]$language, [bool]$runAnywayWithBadLanguage=$false)

        $lng = @([Windows.Media.Ocr.OcrEngine]::AvailableRecognizerLanguages | ? {($_.LanguageTag -ieq $language) -or ($_.LanguageTag.Split("-")[0] -ieq $language) })[0]

        if ($lng -eq $null)
        {
            if (-not $runAnywayWithBadLanguage)
            {
                Show-SupportedLanguages -userLang $language
                return
            }
            else
            {
                if ($memo.ContainsKey("UserProfileLanguages")) {
                    $ocrEngine = $memo["UserProfileLanguages"]
                } else {
                    $ocrEngine = [Windows.Media.Ocr.OcrEngine]::TryCreateFromUserProfileLanguages()
                    $memo.Add("UserProfileLanguages", $ocrEngine)
                }
            }
        }
        else
        {
            if ($memo.ContainsKey($lng.LanguageTag)) {
                $ocrEngine = $memo[$lng.LanguageTag]
            } else {
                $ocrEngine = [Windows.Media.Ocr.OcrEngine]::TryCreateFromLanguage($lng)
                $memo.Add($lng.LanguageTag, $ocrEngine)
            }
        }

        foreach ($p in $Path)
        {
            # From MSDN, the necessary steps to load an image are:
            # Call the OpenAsync method of the StorageFile object to get a random access stream containing the image data.
            # Call the static method BitmapDecoder.CreateAsync to get an instance of the BitmapDecoder class for the specified stream.
            # Call GetSoftwareBitmapAsync to get a SoftwareBitmap object containing the image.
            #
            # https://docs.microsoft.com/en-us/windows/uwp/audio-video-camera/imaging#save-a-softwarebitmap-to-a-file-with-bitmapencoder

            # .Net method needs a full path, or at least might not have the same relative path root as PowerShell
            $p = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($p)

            $params = @{
                AsyncTask  = [Windows.Storage.StorageFile]::GetFileFromPathAsync($p)
                ResultType = [Windows.Storage.StorageFile]
            }
            $storageFile = Await @params

            $params = @{
                AsyncTask  = $storageFile.OpenAsync([Windows.Storage.FileAccessMode]::Read)
                ResultType = [Windows.Storage.Streams.IRandomAccessStream]
            }
            $fileStream = Await @params

            $params = @{
                AsyncTask  = [Windows.Graphics.Imaging.BitmapDecoder]::CreateAsync($fileStream)
                ResultType = [Windows.Graphics.Imaging.BitmapDecoder]
            }
            $bitmapDecoder = Await @params

            $params = @{
                AsyncTask  = $bitmapDecoder.GetSoftwareBitmapAsync()
                ResultType = [Windows.Graphics.Imaging.SoftwareBitmap]
            }
            $softwareBitmap = Await @params

            # Run the OCR
            Await $ocrEngine.RecognizeAsync($softwareBitmap) ([Windows.Media.Ocr.OcrResult])
        }
    }


2. Download the 2 DLL files ('Microsoft.Windows.SDK.NET.dll' and 'WinRT.Runtime.dll') from the following link into that same 'Resources' directory:
https://github.com/Windos/BurntToast/tree/main/BurntToast/lib/Microsoft.Windows.SDK.NET

3. Now, look at 'Solution Explorer' in Visual Studio and you will see that a 'Resources' directory was automatically added to your project and contains the 3 files you put into that directory. Select each file (one at a time), find the 'Properties' pane, and change 'Copy to Output Directory' to 'Copy always'. You can also change 'Build Action' to 'Embedded Resource' if you want these 3 files to be stored within your assembly. If you do this, you can write code to automatically write the 3 files to the 'Resources' directory upon application startup so you don't have to distribute the files with your app's binary. That's up to you though. Here are the relevant sections from my .csproj file for your reference:
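
    <ItemGroup>
      <None Update="Resources\Get-Text-Win-OCR.ps1">
        <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      </None>
      <None Update="Resources\Microsoft.Windows.SDK.NET.dll">
        <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      </None>
      <None Update="Resources\WinRT.Runtime.dll">
        <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      </None>
    </ItemGroup>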
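
For the 'Embedded Resource' variant mentioned in this step, the item entries change shape. The following is only a minimal sketch, assuming an SDK-style project and the same three file names used above; it is not taken from the original .csproj, and Visual Studio may generate something slightly different:

```xml
<!-- Assumption: SDK-style project. Stop treating the three files as plain 'None' items... -->
<ItemGroup>
  <None Remove="Resources\Get-Text-Win-OCR.ps1" />
  <None Remove="Resources\Microsoft.Windows.SDK.NET.dll" />
  <None Remove="Resources\WinRT.Runtime.dll" />
</ItemGroup>

<!-- ...and embed them in the assembly instead. -->
<ItemGroup>
  <EmbeddedResource Include="Resources\Get-Text-Win-OCR.ps1" />
  <EmbeddedResource Include="Resources\Microsoft.Windows.SDK.NET.dll" />
  <EmbeddedResource Include="Resources\WinRT.Runtime.dll" />
</ItemGroup>
```

With this variant the files are no longer copied to the output directory, so the app has to write them into a 'Resources' directory itself before 'WindowsOcrService' is created.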

4. Add the 'Microsoft.PowerShell.SDK' nuget package to your project. For example, here's the relevant section from my .csproj file:
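
A package reference of this kind generally looks like the sketch below. The version number is an assumption used as a placeholder (the thread only mentions v6.x.x and v7.x.x); pick the release that matches your target framework:

```xml
<ItemGroup>
  <!-- Assumption: 7.3.0 is only a placeholder version, not necessarily the one used in this thread. -->
  <PackageReference Include="Microsoft.PowerShell.SDK" Version="7.3.0" />
</ItemGroup>
```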

5. I'm not sure if this last step is necessary or not but I'm mentioning it just in case it is. You may need to modify 'TargetFramework' in your .csproj file from (for example) 'net7.0' to (for example) 'net7.0-windows10.0.19041.0'. There are a number of different Windows 10/11 versions you can target which should work but that just happens to be the version I am currently targeting. Here's the relevant section from my .csproj file:

    <PropertyGroup>
      <OutputType>Exe</OutputType>
      <TargetFramework>net7.0-windows10.0.19041.0</TargetFramework>
      <ImplicitUsings>enable</ImplicitUsings>
      <Nullable>enable</Nullable>
    </PropertyGroup>
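
To illustrate the "other options" for 'TargetFramework', here is a hedged sketch. Only the 19041 value comes from this thread; the commented-out monikers are assumptions chosen as examples of the same naming pattern, and whether they work depends on the Windows SDK targeting packs available on your machine:

```xml
<PropertyGroup>
  <!-- Use exactly one Windows-specific TargetFramework. -->
  <TargetFramework>net7.0-windows10.0.19041.0</TargetFramework>
  <!-- Illustrative alternatives (assumptions, not from the original .csproj): -->
  <!-- <TargetFramework>net7.0-windows10.0.17763.0</TargetFramework> -->
  <!-- <TargetFramework>net7.0-windows10.0.22000.0</TargetFramework> -->
</PropertyGroup>
```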


Now, PowerShell 7 will be able to successfully utilize the needed WinRT types and should not throw an exception whenever you create a new instance of 'WindowsOcrService':

    var service = new WindowsOcrService();

P.S. Thanks to the project authors for writing this awesome library. It is really nice to have all of these OCR engines consolidated into a single library! If you would like for me to push the changes required to make this project multi-target both .NET Standard (as it already is) and .NET Core (with the changes I've mentioned), just let me know. I would be happy to contribute the necessary changes.