dotandimet / Mojo-Feed

Mojo::DOM based Feed Parser
Other
4 stars 1 forks source link

Lazy loading of attributes #1

Closed mdom closed 6 years ago

mdom commented 6 years ago

Hi! This is just a rough draft to see if you like the direction I'm working on? I had to rewrite a lot, but all test but t/03-feed-find.t are passing. No idea, what i broke there, still trying to find that error.

Cheers, Mario

mdom commented 6 years ago

Ah, okay, i cached the dom and text attributes of Mojo::Feed and didn't notice that you use one Mojo::Feed object to parse multiple feeds. I reverted that change back and prevent caching of all feed related attributes in Mojo::Feed. What do you think about returning new Mojo::Feed instances from parse?

dotandimet commented 6 years ago

Hey Mario, I really like the idea of lazy attributes for everything. I think that Mojo::Feed is a bit too complex right now, with the multiple entry points to parsing and possible states. Probably this is a compromise because I just kept the old tests from Mojolicious::Plugin::FeedReader. It might make sense to return new objects from parse(), or maybe to remove parse() altogether, and just use new?

mdom commented 6 years ago

I agree that it's maybe a little bit too complex, but i'm not really sure how to address that. Maybe it would be easier to seperate it into two packages: one to fetch and discover feeds and one that is actually the feed? Although i don't know how to make that work with the new name (we could name the base class for feeds Mojo::Feed::Feed... :smiley:)

Maybe it's easier to have:

$feedr = Mojo::FeedReader->new(ua => $ua);
$url = $feedr->discover('http://example.org');
$feed = $feedr->parse($url);
say $feed->title

Or maybe it's easier if parsing and creating are really seperate operations?

$feedr = Mojo::Feed->new(ua => $ua);
$url = $feedr->discover('http://example.org');
$feed = $feedr->parse($url);
say $feed->title

I'm not really sure. Without seperating this functionality I imagine it would be weird to call discover and you would have to use ua a lot in new to prevent creating many user agents. Does that make sense? It's really early here... :sleepy:

dotandimet commented 6 years ago

If we move the fetching and parsing part into its own module, we could just call it Mojo::Feed::Reader :smiley: I'm inclined to go with that (extracting the fetching part). Most of the complexity I think is borrowed from XML::Atom::Feed, which tries to give a very friendly interface and hides considerable complexity inside.

dotandimet commented 6 years ago

Sorry, I meant the _at(_with_namespaces) method, seen in the commit with the message Handle multiple matches with namespaces - I presume that's fixing something you ran into which is unrelated to the lazy loading feature. Is this something you ran into with a podcast feed? I'd really like to have a test for it that fails in my branch so I can see what it changes.

BTW, I found podite, which looks very cool! I felt a sense of Deja Vu seeing the URLQueue class - I've had messier and more primitive versions of something similar in my own project. My last attempt is Mojo::UserAgent::Role::Queued, which tries to stick the queue inside the UserAgent. I'd love to know what you think (I'd be happy even with someone telling me it's a dumb idea).

dotandimet commented 6 years ago

Anyway, is it OK if I merge this pull request? Let's split the part that uses a Mojo::UserAgent into a Mojo::Feed::Reader class; then Mojo::Feed and Mojo::Feed::Item can be pure parsing classes with lazy attributes.

mdom commented 6 years ago

On Sun, Mar 18, 2018 at 01:34:26PM -0700, Dotan Dimet wrote:

Sorry, I meant the _at(_with_namespaces) method, seen in the commit with the message Handle multiple matches with namespaces

  • I presume that's fixing something you ran into which is unrelated to the lazy loading feature. Is this something you ran into with a podcast feed? I'd really like to have a test for it that fails in my branch so I can see what it changes.

No, this is just some refactoring i did on the side. Sorry, i should have really made smaller commits and explain them better. The real commit where i introduces at_with_namespace was in https://github.com/dotandimet/Mojo-Feed/pull/1/commits/c301a2d8dbe77d7986ef8749cdee33f1e65e3a82#diff-9c61acd8328d7ab1342e1a5dba8c1d64R26

It just checks generally if the complete tag matches with the one we searched for and does not have a hardcoded list of namespaced tags we're interested in. I haven't checked if there is already a test to see if itunes:summary is picked up, otherwise i will add one.

BTW, I found podite, which looks very cool!

Thanks!

I felt a sense of Deja Vu seeing the URLQueue class - I've had messier and more primitive versions of something similar in my own project. My last attempt is Mojo::UserAgent::Role::Queued, which tries to stick the queue inside the UserAgent. I'd love to know what you think (I'd be happy even with someone telling me it's a dumb idea).

No, this is a really good idea to have a user agent with queue support on cpan. I just stole my version from jberger. I'll try your version the next time i need a queue! For podite I'll probably need a bit more control over the process and try things like spreading the downloads over multiple domains if possible and experiment with the api. But maybe i can use your promise api. Let's see... :)

Cheers, Mario

mdom commented 6 years ago

On Sun, Mar 18, 2018 at 02:20:33PM -0700, Dotan Dimet wrote:

Anyway, is it OK if I merge this pull request? Let's split the part that uses a Mojo::UserAgent into a Mojo::Feed::Reader class; then Mojo::Feed and Mojo::Feed::Item can be pure parsing classes with lazy attributes.

Sure, great! I just pushed my last changes, this time with a longer commit message. Should i take a stab at making everything lazy? I don't want to drown you in pull requests ... :/

Cheers, Mario