Open lahma0 opened 10 months ago
I figured out a solution (after spending literally the entire day on this). I'm posting how I fixed the issue here in case anyone else runs into it and wants to know the solution. Just follow these steps:
1. Create a 'Resources' directory in your project directory and save the following script into it as 'Get-Text-Win-OCR.ps1':
#initially taken from: https://github.com/HumanEquivalentUnit/PowerShell-Misc/blob/03cbe7deaa545a96e5102777877ac929c5b67a62/Get-Win10OcrTextFromImage.ps1
using namespace Windows.Storage
using namespace Windows.Graphics.Imaging
$ErrorActionPreference = "Stop"
$VerbosePreference = "SilentlyContinue"
Add-Type -AssemblyName Microsoft.Windows.SDK.NET.dll
Add-Type -AssemblyName WinRT.Runtime.dll
$getAwaiterBaseMethod = [WindowsRuntimeSystemExtensions].GetMember('GetAwaiter').
    Where({ $PSItem.GetParameters()[0].ParameterType.Name -eq 'IAsyncOperation`1' }, 'First')[0]
Function Await {
    param($AsyncTask, $ResultType)
    $getAwaiterBaseMethod.
        MakeGenericMethod($ResultType).
        Invoke($null, @($AsyncTask)).
        GetResult()
}
Function Show-SupportedLanguages {
    [CmdletBinding()]
    param($userLang)
    Write-Verbose "The language $userLang is not supported"
    Write-Verbose "Here is a list of possible languages for the OCR engine:"
    [Windows.Media.Ocr.OcrEngine]::AvailableRecognizerLanguages | % { Write-Verbose $_.LanguageTag }
}
$memo = @{}
Function Get-Text-OCR {
    [CmdletBinding()]
    param($Path, [string]$language, [bool]$runAnywayWithBadLanguage=$false)
    $lng = @([Windows.Media.Ocr.OcrEngine]::AvailableRecognizerLanguages | ? {($_.LanguageTag -ieq $language) -or ($_.LanguageTag.Split("-")[0] -ieq $language) })[0]
    if ($lng -eq $null)
    {
        if (-not $runAnywayWithBadLanguage)
        {
            Show-SupportedLanguages -userLang $language
            return
        }
        else
        {
            if ($memo.ContainsKey("UserProfileLanguages")) {
                $ocrEngine = $memo["UserProfileLanguages"]
            } else {
                $ocrEngine = [Windows.Media.Ocr.OcrEngine]::TryCreateFromUserProfileLanguages()
                $memo.Add("UserProfileLanguages", $ocrEngine)
            }
        }
    }
    else
    {
        if ($memo.ContainsKey($lng.LanguageTag)) {
            $ocrEngine = $memo[$lng.LanguageTag]
        } else {
            $ocrEngine = [Windows.Media.Ocr.OcrEngine]::TryCreateFromLanguage($lng)
            $memo.Add($lng.LanguageTag, $ocrEngine)
        }
    }
    foreach ($p in $Path)
    {
        # From MSDN, the necessary steps to load an image are:
        # Call the OpenAsync method of the StorageFile object to get a random access stream containing the image data.
        # Call the static method BitmapDecoder.CreateAsync to get an instance of the BitmapDecoder class for the specified stream.
        # Call GetSoftwareBitmapAsync to get a SoftwareBitmap object containing the image.
        #
        # https://docs.microsoft.com/en-us/windows/uwp/audio-video-camera/imaging#save-a-softwarebitmap-to-a-file-with-bitmapencoder
        # .Net method needs a full path, or at least might not have the same relative path root as PowerShell
        $p = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($p)
        $params = @{
            AsyncTask = [Windows.Storage.StorageFile]::GetFileFromPathAsync($p)
            ResultType = [Windows.Storage.StorageFile]
        }
        $storageFile = Await @params
        $params = @{
            AsyncTask = $storageFile.OpenAsync([Windows.Storage.FileAccessMode]::Read)
            ResultType = [Windows.Storage.Streams.IRandomAccessStream]
        }
        $fileStream = Await @params
        $params = @{
            AsyncTask = [Windows.Graphics.Imaging.BitmapDecoder]::CreateAsync($fileStream)
            ResultType = [Windows.Graphics.Imaging.BitmapDecoder]
        }
        $bitmapDecoder = Await @params
        $params = @{
            AsyncTask = $bitmapDecoder.GetSoftwareBitmapAsync()
            ResultType = [Windows.Graphics.Imaging.SoftwareBitmap]
        }
        $softwareBitmap = Await @params
        # Run the OCR
        Await $ocrEngine.RecognizeAsync($softwareBitmap) ([Windows.Media.Ocr.OcrResult])
    }
}
2. Download the 2 DLL files ('Microsoft.Windows.SDK.NET.dll' and 'WinRT.Runtime.dll') from the following link into that same 'Resources' directory:
https://github.com/Windos/BurntToast/tree/main/BurntToast/lib/Microsoft.Windows.SDK.NET
3. Now, look at 'Solution Explorer' in Visual Studio and you will see that a 'Resources' directory was automatically added to your project and contains the 3 files you put into that directory. Select each file (one at a time), find the 'Properties' pane, and change 'Copy to Output Directory' to 'Copy always'. You can also change 'Build Action' to 'Embedded Resource' if you want these 3 files to be stored within your assembly. If you do this, you can write code that writes the 3 files to the 'Resources' directory on application startup, so you don't have to distribute the files alongside your app's binary. That's up to you though. Here are the relevant sections from my .csproj file for your reference:
4. Add the 'Microsoft.PowerShell.SDK' NuGet package to your project. For example, here's the relevant section from my .csproj file:
5. I'm not sure whether this last step is necessary, but I'm mentioning it just in case. You may need to change 'TargetFramework' in your .csproj file from (for example) 'net7.0' to (for example) 'net7.0-windows10.0.19041.0'. There are a number of different Windows 10/11 versions you can target that should work; that just happens to be the version I am currently targeting. Here's the relevant section from my .csproj file, which has another possible 'TargetFramework' version commented out (just so you can see one of the other options):
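As a rough sketch of how steps 3 to 5 could fit together in a single .csproj: this is not the original project file. The script file name is the one mentioned in the question ('Get-Text-Win-OCR.ps1'), while the package version and the commented-out alternative target framework are assumptions chosen purely for illustration.

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- Step 5: Windows-specific target framework (the version named in step 5). -->
    <TargetFramework>net7.0-windows10.0.19041.0</TargetFramework>
    <!-- Assumed alternative, shown only as an example of another Windows SDK version: -->
    <!-- <TargetFramework>net7.0-windows10.0.22621.0</TargetFramework> -->
  </PropertyGroup>

  <ItemGroup>
    <!-- Step 4: the version number is an assumption; use one that matches your target framework. -->
    <PackageReference Include="Microsoft.PowerShell.SDK" Version="7.3.0" />
  </ItemGroup>

  <ItemGroup>
    <!-- Step 3: copy the script and both DLLs to the output directory on every build. -->
    <None Update="Resources\Get-Text-Win-OCR.ps1">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Update="Resources\Microsoft.Windows.SDK.NET.dll">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Update="Resources\WinRT.Runtime.dll">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
  </ItemGroup>

</Project>
```

If you go the 'Embedded Resource' route from step 3 instead, the three `None` items would be declared as `EmbeddedResource` items, and your startup code would be responsible for writing them back out to disk.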
Now, PowerShell 7 will be able to use the needed WinRT types and should no longer throw an exception when you create a new instance of 'WindowsOcrService':
var service = new WindowsOcrService();
P.S. Thanks to the project authors for writing this awesome library. It is really nice to have all of these OCR engines consolidated into a single library! If you would like me to push the changes required to make this project multi-target both .NET Standard (as it already does) and .NET Core (with the changes I've mentioned), just let me know. I would be happy to contribute them.
Is there any way to use this package in a project targeting .NET Core (or just .NET now)? When trying to create a new instance of WindowsOcrService in a project targeting .NET 7, without any other NuGet packages or references (other than 'Ocr.Wrapper'), I get the following exception:
If I reference the NuGet package 'Microsoft.PowerShell.SDK' (v6.x.x or v7.x.x), I instead get this exception:
As far as I can tell, the problem is related to the 'Get-Text-Win-OCR.ps1' script being run within PowerShell 7 (instead of PowerShell 5) when the project is targeting .NET Core (.NET 5/6/7). Is there any solution to this other than switching my project to target .NET Framework? Thanks!