pipes-digital / pipes

Repository for Pipes
https://pipes.digital
GNU Affero General Public License v3.0

Notice: Internal block function changes #141

Open onli opened 5 months ago

onli commented 5 months ago

tldr: pipes.digital just got a significant update: blocks now work differently. The aim is to improve performance and stability. If your pipe broke because of those changes, please let me know; it can probably be fixed.

These changes are supposed to be added to the CE version as well, in the near future.

Context

For a while now, the public instance of Pipes hasn't been all that stable. I battled against that, first with a better server, then with a search for bugs, then with timeouts and more monitoring. All measures helped for a while. But recently it got worse again, which prompted me to work on some more fundamental changes.

The change

Before, blocks were basically completely independent. Each block used feedparser to parse its input, which it received as a raw String. The block then did its work, manipulated the feed, and afterwards sent the feed as a String to the next block.

This meant a continuous reparsing of feeds into feedparser/RSS objects, which is exactly the activity where Pipes sometimes still got stuck.

The new system avoids a lot of this work by not expecting Strings anymore. Instead, only the feed block and the service integration blocks parse their input/given link via the feedparser gem. All other blocks expect an already formed RSS object (as in: ruby/rss). Each block then works with this object to create the modified RSS object, which is handed to the next block as an object, not as a String.
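To make the difference concrete, here is a minimal sketch of both styles. It is not the actual Pipes code: the block functions, the `parse_feed` helper, the parse counter and the sample feed are all made up for illustration, and ruby/rss's own parser stands in for the feedparser step.

```ruby
require "rss"

# A tiny RSS 2.0 document used as input (made up for this sketch).
FEED_XML = <<~XML
  <rss version="2.0">
    <channel>
      <title>Example</title>
      <link>https://example.org/</link>
      <description>Example feed</description>
      <item>
        <title>first post</title>
        <link>https://example.org/1</link>
        <description>one</description>
      </item>
      <item>
        <title>second post</title>
        <link>https://example.org/2</link>
        <description>two</description>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper that counts how often a feed gets parsed.
$parse_count = 0

def parse_feed(xml)
  $parse_count += 1
  RSS::Parser.parse(xml, false) # false: skip strict validation
end

# Old style: String in, String out, so every block has to parse again.
def old_upcase_block(xml)
  feed = parse_feed(xml)
  feed.items.each { |item| item.title = item.title.upcase }
  feed.to_s
end

def old_prefix_block(xml)
  feed = parse_feed(xml)
  feed.items.each { |item| item.title = "[news] #{item.title}" }
  feed.to_s
end

# New style: RSS object in, RSS object out, no reparsing in between.
def upcase_block(feed)
  feed.items.each { |item| item.title = item.title.upcase }
  feed
end

def prefix_block(feed)
  feed.items.each { |item| item.title = "[news] #{item.title}" }
  feed
end

def old_pipeline(xml)
  old_prefix_block(old_upcase_block(xml))
end

def new_pipeline(xml)
  # Only the entry point parses; the other blocks share the object.
  prefix_block(upcase_block(parse_feed(xml)))
end
```

In this sketch the old pipeline parses once per block, while the new pipeline parses once in total, no matter how many blocks follow.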

Possible issues

Two sources of issues might occur:

  1. Some of the blocks had to be almost completely rewritten, so I might have missed something when recreating them.
  2. Before, the download block was internally interchangeable with the feed block, as both of them in the end piped a raw String to the next block, which then parsed the input. Now the download block really needs to be combined with an extract block or a feed builder block (for line-by-line feed creation); no other block is compatible with it. This was already enforced in the editor for a while, but really early pipes from before the restriction could still have a direct connection from a download block to one of the other blocks. Those pipes will break now.

So if you encounter strange behaviour or plain bugs, please let me know, either here or in a separate issue.
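The download block restriction can be pictured as a simple connection check. This is only a sketch of the rule as described above: the block identifiers (`"download"`, `"extract"`, `"builder"`, `"sort"`, `"feed"`) and the `connection_allowed?` function are hypothetical, not the editor's real names or code, and connections that do not start at a download block are simply assumed to be fine here.

```ruby
# Hypothetical identifiers: blocks that emit a raw String, and the only
# blocks that can consume one.
RAW_PRODUCERS = %w[download].freeze
RAW_CONSUMERS = %w[extract builder].freeze

# Returns true if `from` may be wired directly into `to`.
def connection_allowed?(from, to)
  # Assumption: only connections leaving a raw producer are restricted.
  return true unless RAW_PRODUCERS.include?(from)

  RAW_CONSUMERS.include?(to)
end

puts connection_allowed?("download", "extract") # allowed
puts connection_allowed?("download", "sort")    # an old pipe like this breaks
```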

baloo-gh commented 5 months ago

Hi,

I love your service - it has made my life so much simpler and better by being able to focus on what is relevant across multiple feeds. This happened about 2 days ago, and hence I speculate it may be linked to the rewriting which I just read about here.

In a nutshell, the output feed I get is an "Internal Server Error" for both of the feeds I am running. From the limited debugging I could do I think it is linked to the "Sort" block. Take this one as example here:

https://www.pipes.digital/feedpreview/l9vp81O1

Right now, I am not using the "Sort" in this pipe, and things work as expected. Once you re-connect the "Sort" block again, then you get an "Internal Server Error".

Hope this helps you find the glitch :-)

Thanks again, super service!!!!

Thomas

onli commented 5 months ago

Hi @baloo-gh

Thanks for the nice words! And thanks for the bug report.

You are exactly right about the error, it is related to the change described above. Because the object the sort block now works with has a different structure, the published field it tried to work with does not exist anymore. I swapped it over to pubDate. If you connect the sort block, it should work again now. Please let me know whether it's correct now.
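For illustration, sorting ruby/rss items by pubDate can look roughly like this. It is a sketch, not the actual sort block: `sort_newest_first`, the sample feed and the choice to put undated items last are assumptions.

```ruby
require "rss"

# Sample feed (made up); one item deliberately has no pubDate.
SORT_XML = <<~XML
  <rss version="2.0">
    <channel>
      <title>Example</title>
      <link>https://example.org/</link>
      <description>Example feed</description>
      <item>
        <title>middle</title>
        <pubDate>Tue, 02 Jan 2024 10:00:00 +0000</pubDate>
      </item>
      <item>
        <title>undated</title>
      </item>
      <item>
        <title>newest</title>
        <pubDate>Wed, 03 Jan 2024 10:00:00 +0000</pubDate>
      </item>
      <item>
        <title>oldest</title>
        <pubDate>Mon, 01 Jan 2024 10:00:00 +0000</pubDate>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: returns the items newest first.
# Items without a pubDate get the epoch as fallback and end up last.
def sort_newest_first(feed)
  feed.items.sort_by { |item| item.pubDate || Time.at(0) }.reverse
end

feed = RSS::Parser.parse(SORT_XML, false) # false: skip strict validation
puts sort_newest_first(feed).map(&:title)
```

The fallback for a missing pubDate is a design choice of this sketch; without it, comparing `nil` with a `Time` would raise an error during sorting.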

baloo-gh commented 5 months ago

Hi @onli

Thanks for the superfast reply and fix! I can confirm that "Sort" now works again; my "lighter" pipe is working again as before.

My more complex 2nd pipe (https://www.pipes.digital/feedpreview/79Lg8JqX) is however still "stuck". While the source RSS feeds are all visible when using "view output" in those feed blocks, the moment they are aggregated together in a "combine" block, things break, leading to no output in the "combine" block and therefore a server error on the output feed itself.

I tried to re-do the "combine" block with two of the feed sources, yet again I get no output in the "combine" block. Is my pipe potentially not yet migrated to your new setup, or do I just have to wait for a bit until everything is being refreshed?

Thanks so much for your help, truly appreciated!

onli commented 5 months ago

Initial guess: Either a resource limit that got introduced with the change by accident, or simply the pipe timeout that was moved around a bit. I'll have a second look soon.

baloo-gh commented 5 months ago

Thanks already now!!!

baloo-gh commented 5 months ago

I think I have isolated why my other pipe does not work: When I remove the source feeds from one specific domain (https://www.n-tv.de/), then suddenly "combine" works again.

I then contrasted one of the "flawed" sources with one which works:

  https://www.n-tv.de/rss (not working in "combine")
  https://www.tagesschau.de/infoservices/alle-meldungen-100~rss2.xml (working in "combine")
  https://www.n-tv.de/rss (working in "combine")

The difference seems to be in the first line of the RSS feed. While Tagesschau and Taz have very simple first lines, N-TV has much more in there. Could it be possible that "combine" is not able to parse the RSS feed from N-TV like it could before the code rewrite?

onli commented 5 months ago

Could it be possible that "combine" is not able to parse the RSS feed from N-TV like it could before the code rewrite?

Yes, I think that is what happened. Though the combine block is not parsing the incoming feed directly anymore (only the feed block does that now), when transferring the items the combine block stumbled over the enclosure in one of those feeds. It was expecting a description.
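A sketch of what transferring items defensively can look like with ruby/rss objects, so that an item carrying an enclosure but no description does not break the combination. This is not the actual combine block: `transfer_item`, `combine`, the hash layout and both sample feeds are made up for illustration.

```ruby
require "rss"

# Sample feed whose item has a description (made up).
TEXT_FEED = <<~XML
  <rss version="2.0">
    <channel>
      <title>Text</title>
      <link>https://example.org/text</link>
      <description>Text feed</description>
      <item>
        <title>article</title>
        <link>https://example.org/article</link>
        <description>some text</description>
      </item>
    </channel>
  </rss>
XML

# Sample feed whose item has an enclosure but no description (made up).
AUDIO_FEED = <<~XML
  <rss version="2.0">
    <channel>
      <title>Audio</title>
      <link>https://example.org/audio</link>
      <description>Audio feed</description>
      <item>
        <title>episode</title>
        <link>https://example.org/episode</link>
        <enclosure url="https://example.org/episode.mp3" length="123" type="audio/mpeg"/>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: copies the fields of one item without assuming
# that description or enclosure are present.
def transfer_item(item)
  {
    title: item.title,
    link: item.link,
    description: item.description.to_s, # nil becomes ""
    enclosure_url: item.enclosure&.url  # nil when there is no enclosure
  }
end

# Hypothetical helper: merges the items of several parsed feeds.
def combine(feeds)
  feeds.flat_map { |feed| feed.items.map { |item| transfer_item(item) } }
end

feeds = [TEXT_FEED, AUDIO_FEED].map { |xml| RSS::Parser.parse(xml, false) }
combine(feeds).each { |item| puts item.inspect }
```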

The pipe seems to work again now. Thanks for the good test cases :)

baloo-gh commented 5 months ago

Wonderful, I can confirm that everything works smoothly again. Thanks so much for your fast help and, as I just recently stumbled over your blog: thank you so very much!!!!

onli commented 5 months ago

You're very welcome ;)