In Morfologik.Stemming, we encountered a problem when testing netstandard2.0 on net471: it fails to load System.Text.Encoding.CodePages.dll and crashes. This no doubt also affects the following modules:
Lucene.Net.Analysis.Kuromoji
Lucene.Net.Analysis.SmartCn
Note that Hunspell in Lucene.Net.Analysis.Common also requires System.Text.Encoding.CodePages when loading some dictionaries, but users are expected to add a reference to their project if, and only if, they require it.
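For reference, the opt-in described in the note above might look roughly like this on the user's side. This is a minimal sketch, not code from Lucene.NET: the class and method names are hypothetical, and windows-1250 is only an assumed example of an encoding a dictionary might declare. It assumes the project references the System.Text.Encoding.CodePages package on targets where that assembly is not already part of the runtime.

```csharp
using System;
using System.Text;

// Hypothetical helper in the user's own application (not part of Lucene.NET).
public static class CodePagesSetup
{
    // Registers the code pages provider, then resolves the requested encoding by name.
    // Registering the same provider instance more than once is harmless.
    public static Encoding GetLegacyEncoding(string name)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(name);
    }
}
```

A caller would invoke something like `CodePagesSetup.GetLegacyEncoding("windows-1250")` once before loading a dictionary that declares that encoding.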
Expected Behavior
.NET Framework should be able to use a netstandard2.0 assembly without receiving an error message.
Steps To Reproduce
This occurred when we upgraded Morfologik.Stemming to net9.0 and also added targets for net8.0 and net9.0, thus requiring us to test netstandard2.0 on something else. We chose net471 and encountered this problem. It isn't clear why we are not seeing this in Lucene.Net, but we should definitely check the runtime before registering an encoding provider, and currently we do not.
Exceptions (if any)
System.IO.FileNotFoundException : Could not load file or assembly 'System.Text.Encoding.CodePages, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.
Lucene.NET Version
4.8.0-beta00017
.NET Version
.NET Framework (the version we test netstandard2.0 with)
Operating System
N/A
Anything else?
This happens because .NET Framework doesn't require this registration. However, our conditional compilation only checks whether the target framework supports FEATURE_ENCODINGPROVIDERS; it does not check the actual runtime being used. In Morfologik.Stemming, this was addressed using the following class, which is called from the static constructors of all of the types that require the encoding.
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading;

namespace Morfologik.Stemming.Support
{
    /// <summary>
    /// Loads the <see cref="System.Text.EncodingProvider"/> for the current runtime for support of
    /// iso-8859-1 encoding.
    /// </summary>
    internal static class EncodingProviderInitializer
    {
        private static int initialized;

        private static bool IsNetFramework =>
#if NETSTANDARD2_0
            RuntimeInformation.FrameworkDescription.StartsWith(".NET Framework", StringComparison.OrdinalIgnoreCase);
#elif NET40_OR_GREATER
            true;
#else
            false;
#endif

        [Conditional("FEATURE_ENCODINGPROVIDERS")]
        public static void EnsureInitialized()
        {
            // Only allow a single thread to call this
            if (0 != Interlocked.CompareExchange(ref initialized, 1, 0)) return;

#if FEATURE_ENCODINGPROVIDERS
            if (!IsNetFramework)
            {
                Initialize();
            }
#endif
        }

#if FEATURE_ENCODINGPROVIDERS
        // NOTE: CodePagesEncodingProvider.Instance loads early, so we need this in a separate method to ensure
        // that it isn't executed until after we know which runtime we are on.
        [MethodImpl(MethodImplOptions.NoInlining)]
        private static void Initialize()
        {
            // Support for iso-8859-1 encoding. See: https://docs.microsoft.com/en-us/dotnet/api/system.text.codepagesencodingprovider?view=netcore-2.0
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        }
#endif
    }
}
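To illustrate the "called from static constructors" part, here is a rough sketch of what a consuming type could look like. The `Latin1Reader` type and the `RegistrationCount` field are hypothetical and exist only for this example. The initializer is replaced by a reduced stand-in that keeps the run-once guard and counts calls, so the snippet compiles on its own; it is not the class shown above.

```csharp
using System;
using System.Text;
using System.Threading;

// Stand-in for the initializer above (assumption: reduced to the run-once guard plus a counter).
internal static class EncodingProviderInitializer
{
    private static int initialized;
    public static int RegistrationCount; // hypothetical, for illustration only

    public static void EnsureInitialized()
    {
        // Only the first caller gets past this guard.
        if (0 != Interlocked.CompareExchange(ref initialized, 1, 0)) return;
        RegistrationCount++;
    }
}

// Hypothetical type that needs the encoding to be available before first use.
internal sealed class Latin1Reader
{
    // The static constructor runs once, before the first instance is created.
    static Latin1Reader()
    {
        EncodingProviderInitializer.EnsureInitialized();
    }

    public string Decode(byte[] bytes) =>
        Encoding.GetEncoding("iso-8859-1").GetString(bytes);
}
```

Because the guard lives in the initializer rather than in each consumer, any number of types can make the same call from their static constructors without registering twice.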