CXuesong / WikiClientLibrary

/*🌻*/ Wiki Client Library is an asynchronous MediaWiki API client library targeting modern .NET platforms
https://github.com/CXuesong/WikiClientLibrary/wiki
Apache License 2.0

Dealing with invalid dates #49

Closed NateKomodo closed 5 years ago

NateKomodo commented 5 years ago

On certain wiki pages, such as "Elon Musk", the following error occurs when executing await page.RefreshAsync(PageQueryOptions.FetchContent | PageQueryOptions.ResolveRedirects); on the page. (Note: the code I am using works on all pages except for a few, such as the one mentioned above.) Error: Newtonsoft.Json.JsonSerializationException: Error setting value to 'ExpiryProxy' on 'WikiClientLibrary.Pages.ProtectionInfo'. ---> System.FormatException: String was not recognized as a valid DateTime.

CXuesong commented 5 years ago

I think the value infinity might be causing the trouble. However, this should already have been properly taken care of. Can you provide me with the version of the WCL package and the .NET runtime version?
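To illustrate the kind of handling meant here, the following is a minimal sketch of parsing a protection expiry that may be the literal value infinity. The class ExpirySketch and the method ParseExpiry are hypothetical names made up for this example and are not the library's actual code; mapping infinity to DateTime.MaxValue is likewise an assumption.

```csharp
using System;
using System.Globalization;

public static class ExpirySketch
{
    // Hypothetical helper for illustration only, not WikiClientLibrary's implementation.
    public static DateTime ParseExpiry(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        // A protection that never expires is reported as "infinity";
        // this sketch assumes DateTime.MaxValue is an acceptable stand-in.
        if (string.Equals(value, "infinity", StringComparison.OrdinalIgnoreCase))
            return DateTime.MaxValue;

        // Otherwise expect an ISO 8601 timestamp such as 2021-06-13T18:14:36Z.
        // The invariant culture keeps the result independent of the machine's locale.
        return DateTime.Parse(value, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }
}
```

With this sketch, ExpirySketch.ParseExpiry("infinity") returns DateTime.MaxValue, and a regular timestamp comes back as a UTC DateTime.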

CXuesong commented 5 years ago

I've used the following code to test your case with WCL 0.7.0-int.2 on .NET Core 2.1, and it works. If possible, can you provide the HTTP response (action=query) that may cause the error? Thanks!

using System;
using System.Threading.Tasks;
using WikiClientLibrary.Client;
using WikiClientLibrary.Pages;
using WikiClientLibrary.Sites;

namespace TestConsoleApp1
{
    class Program
    {
        static async Task Main(string[] args)
        {
            using (var client = new WikiClient())
            {
                var site = new WikiSite(client, "https://en.wikipedia.org/w/api.php");
                await site.Initialization;
                var page = new WikiPage(site, "Elon Musk");
                await page.RefreshAsync(PageQueryOptions.FetchContent | PageQueryOptions.ResolveRedirects);
                Console.WriteLine(string.Join("\n", page.Protections));
            }
        }
    }
}

NateKomodo commented 5 years ago

Here is the intercepted response from the wiki: https://imgur.com/W83xgMy

I am using the latest WCL version and the latest .NET Framework (console app) version.

CXuesong commented 5 years ago

I changed CultureInfo.CurrentCulture to en-gb and saw the following exception. I guess I need to specify the culture explicitly when parsing the timestamp.

Unhandled Exception: Newtonsoft.Json.JsonSerializationException: Error setting value to 'ExpiryProxy' on 'WikiClientLibrary.Pages.ProtectionInfo'. ---> System.FormatException: String '06/13/2021 18:14:36' was not recognized as a valid DateTime.
   at System.DateTimeParse.Parse(ReadOnlySpan`1 s, DateTimeFormatInfo dtfi, DateTimeStyles styles)
   at System.DateTime.Parse(String s, IFormatProvider provider, DateTimeStyles styles)
   at lambda_method(Closure , Object , Object )
   at Newtonsoft.Json.Serialization.ExpressionValueProvider.SetValue(Object target, Object value)
   --- End of inner exception stack trace ---
   at Newtonsoft.Json.Serialization.ExpressionValueProvider.SetValue(Object target, Object value)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.SetPropertyValue(JsonProperty property, JsonConverter propertyConverter, JsonContainerContract containerContract, JsonProperty containerProperty, JsonReader reader, Object target)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.PopulateObject(Object newObject, JsonReader reader, JsonObjectContract contract, JsonProperty member, String id)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateObject(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.PopulateList(IList list, JsonReader reader, JsonArrayContract contract, JsonProperty containerProperty, String id)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateList(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, Object existingValue, String id)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)
   at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)
   at Newtonsoft.Json.Linq.JToken.ToObject(Type objectType, JsonSerializer jsonSerializer)
   at Newtonsoft.Json.Linq.JToken.ToObject[T](JsonSerializer jsonSerializer)
   at WikiClientLibrary.Pages.Queries.Properties.PageInfoPropertyGroup..ctor(JObject jPage)
   at WikiClientLibrary.Pages.Queries.Properties.PageInfoPropertyProvider.ParsePropertyGroup(JObject json)
   at WikiClientLibrary.Pages.Queries.WikiPageQueryProvider.ParsePropertyGroups(JObject json)+MoveNext()
   at WikiClientLibrary.Pages.WikiPage.OnLoadPageInfo(JObject jpage, IWikiPageQueryProvider options)
   at WikiClientLibrary.RequestHelper.RefreshPagesAsync(IEnumerable`1 pages, IWikiPageQueryProvider options, CancellationToken cancellationToken)
   at TestConsoleApp1.Program.Main(String[] args)
   at TestConsoleApp1.Program.<Main>(String[] args)

NateKomodo commented 5 years ago

Is there an ETA for the fix? Looking at when the previous commits were made, it seems like the repo is stagnant right now.

CXuesong commented 5 years ago

I think we are suffering from JamesNK/Newtonsoft.Json#862 (and I'm reading the post later… ugh). Basically, if you parse a string containing a date expression into a JValue and then convert it back to a string, you will find the string has been reformatted. See https://dotnetfiddle.net/rtp98c .

Console.WriteLine(JToken.Parse("\"2019-01-15T10:10:10Z\"").ToObject<string>());

(seems to be without culture-awareness) gives you

01/15/2019 10:10:10

And https://github.com/CXuesong/WikiClientLibrary/blob/8db7bed5fe55df9470713a54fec5b0406d4dd108/WikiClientLibrary/MediaWikiUtility.cs#L133-L138 uses the current culture to parse the reformatted date string (this is unexpected). In en-gb, unfortunately, there are no MM/dd/yyyy date patterns (I've checked CultureInfo.GetCultureInfo("en-gb").DateTimeFormat.GetAllDateTimePatterns()), and that caused the exception.
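The mismatch described above can be reproduced with the base class library alone. The sketch below is only an illustration: CultureSketch and TryParseWith are hypothetical names, and it is not necessarily how the fix was implemented. It assumes the en-GB culture data is available on the machine.

```csharp
using System;
using System.Globalization;

public static class CultureSketch
{
    // Hypothetical helper: parses with an explicitly chosen culture
    // instead of relying on CultureInfo.CurrentCulture.
    public static bool TryParseWith(string text, CultureInfo culture, out DateTime result)
    {
        return DateTime.TryParse(text, culture, DateTimeStyles.None, out result);
    }

    public static void Demo()
    {
        // Formatting with the invariant culture yields the month-first form
        // seen in the exception message: 06/13/2021 18:14:36
        string reformatted = new DateTime(2021, 6, 13, 18, 14, 36)
            .ToString(CultureInfo.InvariantCulture);
        Console.WriteLine(reformatted);

        // en-GB is day-first, so "13" is read as a month and parsing fails.
        CultureInfo enGb = CultureInfo.GetCultureInfo("en-GB");
        Console.WriteLine(TryParseWith(reformatted, enGb, out DateTime ignored));

        // Parsing with the same culture that produced the string succeeds.
        Console.WriteLine(TryParseWith(reformatted, CultureInfo.InvariantCulture, out DateTime parsed));
    }
}
```

Another option would be to stop the reformatting at its source: Newtonsoft.Json has a DateParseHandling.None setting that keeps date-like strings as plain strings when the JSON is loaded, so the original ISO 8601 text would reach the parser unchanged.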

CXuesong commented 5 years ago

Published v0.7.0-int.3. You may see it on NuGet in a few minutes.

Actually I'm bringing it out of stagnancy right now 😉

NateKomodo commented 5 years ago

👍

CXuesong commented 5 years ago

Closed the issue for now. If you bump into similar problems, feel free to open it again!