Closed by dannyroosevelt 1 year ago
The source with the reported issue is using rss_new-item-in-feed v0.0.1, but I believe the newest version, 0.0.2, might have fixed the bug.
Advising the customer to recreate the source.
I was unable to recreate the bug in a v0.0.2 version of the component.
Adding in example problem workflow.
I believe the problem may be a misordering of events, not that the RSS feed isn't emitting new events:
The latest post in this RSS feed is from May 19th, but the latest event emitted corresponds to a post from May 10th.
It might be that the feedparser reading this XML feed is ordering items alphabetically by the guid property.
The guids in chronological order:
https://monospacedmonologues.com/2022/05/lessons-from-a-failed-startup-corporate-values/
https://monospacedmonologues.com/2022/05/lessons-from-a-failed-startup-do-research/
https://monospacedmonologues.com/2022/05/how-to-drive-fast/
But the source's event logs show that the How to drive fast article is first, which hints that the guids are being sorted in alphanumeric order, and that this is causing the issue.
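To make the theory concrete, here is a minimal sketch that sorts the three guids above as plain strings. It only illustrates how string ordering can differ from the order in which the posts were listed; it is not taken from the RSS source or from feedparser, and the variable names are made up for this example.

```typescript
// The three guids from the feed, in the order listed above
// (described in the comment as chronological).
const guidsAsListed: string[] = [
  "https://monospacedmonologues.com/2022/05/lessons-from-a-failed-startup-corporate-values/",
  "https://monospacedmonologues.com/2022/05/lessons-from-a-failed-startup-do-research/",
  "https://monospacedmonologues.com/2022/05/how-to-drive-fast/",
];

// Copy before sorting so the original order is preserved.
// With no comparator, sort() compares strings by UTF-16 code units.
const guidsSorted: string[] = [...guidsAsListed].sort();

// All three share the ".../2022/05/" prefix, so the slug decides the order:
// "how-to-drive-fast" sorts before both "lessons-from-..." guids.
console.log(guidsSorted[0]);
```

If something in the pipeline sorted on guid like this, the How to drive fast post would move to the front regardless of when it was published, which matches what the event logs show.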
@alysonturing does this make sense?
I'm not as familiar with the feedparser module. Is it possible that it's using the guid property as a primary ID and sorting on it before passing each item to the readable function?
I'm not sure I follow why the order would matter; it seems to be using the "unique" dedup strategy. Is there something that would deduplicate older values regardless?
@SamirTalwar I just published a new version of the RSS source. Could you do me a favor and try to create a new source at https://pipedream.com/new/sources ?
If that doesn't work, can you visit the Logs tab of the source and share any logs that appear there?
Hey, sorry, I just saw your question. I believe the guid is not generated by the feedparser, so we can't say for sure which method generates it.
I just deleted and recreated the source. The events still seem to be in the wrong order.
The latest event just went out successfully but I expect that's because its GUID now contains "2022/06", not "2022/05", which suggests your theory about sorting by GUID holds water.
The logs are as follows:
2022-06-02T18:17:15 End
2022-06-02T18:17:12 Start
{
"timestamp": 1654186631,
"timezone_utc": {
"date": {
"day": 2,
"month": 6,
"year": 2022
},
"iso8601": {
"date": "2022-06-02",
"time": "16:17:11+00:00",
"timestamp": "2022-06-02T16:17:11+00:00"
},
"metadata": {
"day_name": "Thursday",
"day_of_week": 4,
"start_of_week": "2022-05-30"
},
"pretty": {
"date": "Jun 2, 2022",
"time": "4:06:11 PM",
"time_24h": "16:17:11"
},
"time": {
"hour": 16,
"millisecond": 841,
"minute": 17,
"second": 11
},
"timezone": "UTC"
},
"timezone_configured": {
"date": {
"day": 2,
"month": 6,
"year": 2022
},
"iso8601": {
"date": "2022-06-02",
"time": "16:17:11+00:00",
"timestamp": "2022-06-02T16:17:11+00:00"
},
"metadata": {
"day_name": "Thursday",
"day_of_week": 4,
"start_of_week": "2022-05-30"
},
"pretty": {
"date": "Jun 2, 2022",
"time": "4:06:11 PM",
"time_24h": "16:17:11"
},
"time": {
"hour": 16,
"millisecond": 841,
"minute": 17,
"second": 11
},
"timezone": "UTC"
},
"interval_seconds": 900
}
2022-06-02T18:17:08 activate
@SamirTalwar thanks for the detail. @alysonturing is going to look into it!
@dannyroosevelt
This issue and this pull request are the same thing, but both are in the columns. Can you check and remove one?
You have better context on those 2 issues, feel free to remove one that we don't need.
This is ready for release!
Is this going to be magically fixed for all users of the RSS source, or do we need to do something?
From what I know, the user needs to recreate the source to update it to the new version.
@SamirTalwar That’s correct, try adding a new RSS source and let us know if that works!
Looks like it's still broken on my end. It worked last week but not this one.
@SamirTalwar is it still emitting events in the wrong order, or are you seeing some different behavior? Do you have example logs / items that aren't emitted correctly?
And just to confirm, is it still on this feed?
@SamirTalwar have you tried creating a new source? Have you tried different browsers, like Safari, Chrome, or Firefox? Can you please try that and let us know if you still see the error? Sharing logs and any other details would be very helpful.
I have tried deleting and recreating the source, both on 9th June and just now. However, it clearly hasn't worked; I am not seeing the changes to rss.app.ts from #3084. Are these changes actually deployed?
Recreating the source has made the latest item show up, but it seems to be fairly random.
I appreciate that you folks need some input but don't you have access to the logs already? The feed and the Pipedream workflow are still the same.
(I haven't tried with different browsers but I cannot see how this would be relevant to getting the correct version of the RSS source.)
@SamirTalwar It wasn't clear to me that this was on the same workflow. We deal with many support issues and are constantly jumping in and out of context, so we may not get it right every time. I appreciate the patience as we investigate!
The new source indeed was not published. We're re-publishing now and working on a better way to catch these cases in the future! I'll test once it's out, and you can give it a try then.
Just opened a new PR adding support for JSON Feed URLs and fixing the sorting on RSS sources. It should help with the issues pointed out here: https://github.com/PipedreamHQ/pipedream/pull/3192
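For readers following along, one way to sort feed items chronologically is sketched below. This is not the code from the PR. The FeedItem shape, the pubdate field name, the sortByPubdate helper, and the example.com items are all assumptions made up for illustration.

```typescript
// Hypothetical item shape; the field names are assumptions for this sketch.
interface FeedItem {
  guid: string;
  pubdate: Date;
}

// Return a new array ordered oldest to newest by publication date,
// leaving the input array untouched.
function sortByPubdate(items: FeedItem[]): FeedItem[] {
  return [...items].sort((a, b) => a.pubdate.getTime() - b.pubdate.getTime());
}

// Made-up items whose guid order is the reverse of their date order.
const items: FeedItem[] = [
  { guid: "https://example.com/b-post/", pubdate: new Date("2022-05-01T00:00:00Z") },
  { guid: "https://example.com/a-post/", pubdate: new Date("2022-05-02T00:00:00Z") },
];

const ordered: FeedItem[] = sortByPubdate(items);
console.log(ordered.map((item) => item.guid));
```

Sorting on a date rather than on the guid string means the emitted order no longer depends on how the post URLs happen to be spelled.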
New items are not getting picked up in the RSS trigger.
Workflow (shared w/ Support): https://pipedream.com/@samirtalwar/tweet-blog-posts-p_PACJno3/edit
RSS feed: https://monospacedmonologues.com/index.xml