dougwaldron / podcast-rewind

An app for rewinding a podcast so you can listen to it as if it's new!
https://podcast-rewind.azurewebsites.net/
The Unlicense

XmlException: An error was encountered when parsing a DateTime value in the XML #75

Open dougwaldron opened 8 months ago

dougwaldron commented 8 months ago

Reported via Sentry:

Affected page: https://podcast-rewind.azurewebsites.net/Setup?feedUrl=https%3A%2F%2Fdaringfireball.net%2Fthetalkshow%2Frss

Feed URL: https://daringfireball.net/thetalkshow/rss

Exception:

System.Xml.XmlException: An error was encountered when parsing a DateTime value in the XML.
Full stack trace:

```
System.Xml.XmlException: Error in line 754 position 43. An error was encountered when parsing a DateTime value in the XML.
File "SyndicationItem.cs", line 138, in DateTimeOffset SyndicationItem.get_PublishDate()
  throw PublishDateException;
File "Models/Dto/ViewPodcastEpisodeDto.cs", line 13, col 56, in new ViewPodcastEpisodeDto(SyndicationItem item)
File "Pages/Setup.cshtml.cs", line 60, col 53, in void SetupModel.LoadData(SyndicationFeed feed)+(SyndicationItem item) => { } [0]
File "Select.SpeedOpt.cs", line 428, in void SelectIListIterator.Fill(IList source, Span results, Func func)
  results[i] = func(source[i]);
File "Select.SpeedOpt.cs", line 408, in TResult[] SelectIListIterator.ToArray()
  Fill(_source, results, _selector);
File "Buffer.cs", line 32, in new Buffer(IEnumerable source)
  TElement[] array = iterator.ToArray();
File "OrderedEnumerable.SpeedOpt.cs", line 29, in List OrderedEnumerable.ToList()
  Buffer buffer = new Buffer(_source);
File "ToCollection.cs", line 24, in List Enumerable.ToList(IEnumerable source)
  if (source == null)
File "Pages/Setup.cshtml.cs", line 60, col 9, in void SetupModel.LoadData(SyndicationFeed feed)
File "Pages/Setup.cshtml.cs", line 27, col 9, in async Task SetupModel.OnGetAsync(string feedUrl, int interval)
File "ExecutorFactory.cs", line 139, in async Task GenericTaskHandlerMethod.Convert(object taskAsObject)
  return await task;
File "ExecutorFactory.cs", line 132, in async Task GenericTaskHandlerMethod.Execute(object receiver, object[] arguments)
  var result = await _thunk(receiver, arguments);
File "PageActionInvoker.cs", line 274, in async Task PageActionInvoker.InvokeHandlerMethodAsync()
  _result = await executor(_instance, arguments);
File "PageActionInvoker.cs", line 659, in async Task PageActionInvoker.InvokeNextPageFilterAsync()
  await Next(ref next, ref scope, ref state, ref isCompleted);
File "PageActionInvoker.cs", line 706, in void PageActionInvoker.Rethrow(PageHandlerExecutedContext context)
  context.ExceptionDispatchInfo?.Throw();
File "PageActionInvoker.cs", line 633, in Task PageActionInvoker.Next(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
  Rethrow(handlerExecutedContext);
File "PageActionInvoker.cs", line 83, in async Task PageActionInvoker.InvokeInnerFilterAsync()
  await Next(ref next, ref scope, ref state, ref isCompleted);
File "ResourceInvoker.cs", line 978, in async Task ResourceInvoker.InvokeNextResourceFilter()+Awaited(?)
  await lastTask;
File "ResourceInvoker.cs", line 1460, in void ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
  }
File "ResourceInvoker.cs", line 890, in Task ResourceInvoker.Next(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
  Rethrow(_resourceExecutedContext!);
File "ResourceInvoker.cs", line 254, in async Task ResourceInvoker.InvokeFilterPipelineAsync()+Awaited(?)
  await invoker.Next(ref next, ref scope, ref state, ref isCompleted);
File "ResourceInvoker.cs", line 124, in async Task ResourceInvoker.InvokeAsync()+Logged(?) x 2
  await invoker.InvokeFilterPipelineAsync();
?, in async Task SentryTracingMiddleware.InvokeAsync(HttpContext context) x 2
File "ExceptionHandlerMiddlewareImpl.cs", line 98, in async Task ExceptionHandlerMiddlewareImpl.Invoke(HttpContext context)+Awaited(?)
  await task;
```

dougwaldron commented 8 months ago

Well, that was a fun evening diving into the code and specs!

TL;DR: It looks like the .NET SyndicationFeed parser incorrectly throws an exception when the date uses a single-digit day.


The RSS feed for The Talk Show uses single-digit days (when less than 10). The SyndicationFeed parser has no problem with this episode:

<guid>https://daringfireball.net/thetalkshow/2023/06/17/ep-379</guid>
<pubDate>Sat, 17 Jun 2023 19:22:55 EDT</pubDate>

But fails for the next episode:


<guid>https://daringfireball.net/thetalkshow/2023/06/07/ep-378</guid>
<pubDate>Wed, 7 Jun 2023 20:13:43 EDT</pubDate>

with the exception message:

An error was encountered when parsing a DateTime value in the XML.


Diving into the code...

When SyndicationFeed tries to load an RSS feed, it first attempts to parse publication dates using the default DateTimeOffset parser. If that doesn't work, it falls back to DateTimeOffset.TryParseExact with an array of possible date formats based on RFC 822.

The RSS 2.0 spec does specify that pubDate should conform to RFC 822 (except that years should be 4 digits).

RFC 822 in turn specifies that the date portion of the date-time value be formatted like so:

date        =  1*2DIGIT month 2DIGIT        ; day month year
                                            ;  e.g. 20 Jun 82

where the 1*2DIGIT notation indicates at least 1 and at most 2 digits for the day value. So single-digit days are valid.
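To make the grammar concrete, here is a minimal sketch that translates the `date` production into a regular expression. It is written in Python purely for illustration (it is not the .NET code), the helper name `is_valid_rfc822_date` is hypothetical, and it assumes the RSS 2.0 allowance for 4-digit years alongside the 2-digit years of RFC 822.

```python
import re

# RFC 822: date = 1*2DIGIT month 2DIGIT   ; day month year
# "1*2DIGIT" means one or two digits, so the day becomes \d{1,2}.
# Assumption: 4-digit years are also accepted, as the RSS 2.0 spec prefers.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
RFC822_DATE = re.compile(rf"(\d{{1,2}}) ({MONTHS}) (\d{{2}}|\d{{4}})")


def is_valid_rfc822_date(text: str) -> bool:
    """Return True if text matches the date production (hypothetical helper)."""
    return RFC822_DATE.fullmatch(text) is not None


# The date portions of both episodes, plus the example from the RFC itself.
for candidate in ("17 Jun 2023", "7 Jun 2023", "20 Jun 82"):
    print(candidate, is_valid_rfc822_date(candidate))
```

All three candidates match, including the single-digit day from episode 378, while a three-digit day such as `117 Jun 2023` would not.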

But the parseFormat array includes no formats with single-digit days. This is a bug.
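As a cross-check outside .NET, the two pubDate values quoted earlier can be fed to Python's standard-library `email.utils.parsedate_to_datetime`, which parses RFC 2822-style dates. This is only a sketch: `parse_pub_date` is a hypothetical wrapper name, and the result says nothing about how .NET behaves, only that a parser written against the RFC grammar accepts the single-digit day.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_pub_date(value: str) -> datetime:
    """Parse an RSS pubDate string (hypothetical wrapper around the stdlib parser)."""
    return parsedate_to_datetime(value)


# pubDate values copied from the two feed items quoted above.
for pub_date in (
    "Sat, 17 Jun 2023 19:22:55 EDT",
    "Wed, 7 Jun 2023 20:13:43 EDT",
):
    parsed = parse_pub_date(pub_date)
    print(pub_date, "->", parsed.isoformat())
```

Both values parse without error here, which is consistent with the single-digit day being valid input rather than a malformed feed.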

dougwaldron commented 1 month ago

I updated the Sentry.io link in the original description since the old one had expired.