forlater-email / navani

forlater's primary mail processing service
GNU Affero General Public License v3.0

multiple urls sometimes dropped if one of them is bad. #3

Open ploum opened 2 years ago

ploum commented 2 years ago

It seems that, in a few cases, when an email contains multiple URLs, processing stops and only some of the URLs are sent. It is hard to reproduce, but I’ve been able to reproduce it with this set of 4 URLs (the 2nd might be the problematic one; only the first one is successfully sent).

https://useplaintext.email/ https://drewdevault.com/blog/index.xml#fnref:2 http://unspicilege.org/index.php?post/L-humain-outresolaire-en-affiches http://www.collaborativefund.com/blog/how-this-all-happened/

Also with this set:

https://live.staticflickr.com/65535/51696787662_1e31625d7c_b.jpg https://blogs.mediapart.fr/jean-marc-b/blog/251118/l-ideologie-sociale-de-la-bagnole-par-andre-gorz https://www.theguardian.com/environment/bike-blog/2021/oct/29/the-bikelash-paradox-how-cycle-lanes-enrage-some-but-win-votes

Nothing is received at all. Maybe because the first URL was not parsable.

My theory is that navani stops processing the message as soon as it encounters a URL for which it cannot send good content.

If my theory is correct, a fix would include:

  1. Parsing subsequent URLs, regardless of the previous one.
  2. Always sending an email about a received URL, with a message "unable to find content" when nothing can be extracted (that way, at least, you are informed).
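
The two points above can be sketched roughly as follows. This is only an illustration in Python, not taken from navani's code: `process_urls`, `fetch_content`, and `send_mail` are hypothetical names, and the real fetching and mailing logic is assumed to live behind those two callables.

```python
from typing import Callable, Iterable, List, Tuple


def process_urls(
    urls: Iterable[str],
    fetch_content: Callable[[str], str],    # hypothetical: returns extracted text, may raise
    send_mail: Callable[[str, str], None],  # hypothetical: sends one email per URL
) -> List[Tuple[str, bool]]:
    """Handle every URL independently so one bad URL cannot stop the rest."""
    results = []
    for url in urls:
        try:
            body = fetch_content(url)
        except Exception:
            # Point 1: a failure on this URL must not abort the loop.
            body = ""
        ok = bool(body and body.strip())
        if not ok:
            # Point 2: still send something, so the sender knows what happened.
            body = f"Unable to find content for {url}"
        # Note: an error raised by send_mail itself is not handled in this sketch.
        send_mail(url, body)
        results.append((url, ok))
    return results
```

With this shape, a message containing four URLs always produces four emails, and the failing ones are easy to spot from their body.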

icyphox commented 2 years ago

Hey, sorry for the late response -- I was on vacation. I'm swamped with work IRL right now, but I'll take a deeper look at this soon.