getpelican / pelican

Static site generator that supports Markdown and reST syntax. Powered by Python.
https://getpelican.com
GNU Affero General Public License v3.0
Atom feeds use relative path for src attribute value in img tags. #812

Closed kickingvegas closed 4 years ago

kickingvegas commented 11 years ago

Problem

Page content that uses relative paths for the src attribute in img tags is left unchanged when rendered into an Atom feed. The src value should be replaced with an absolute path. This behavior is probably also seen with RSS feeds (haven't tested).

Fix Request

Ideally the absolute path would be prefixed with SITEURL by default, with the option to override that prefix through another variable (e.g. a path pointing to a CDN).

Perhaps FEED_ASSET_URL?
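To make the request concrete, here is a minimal sketch of the fallback described above. FEED_ASSET_URL is only the name proposed in this comment, not an existing Pelican setting, and the helper name `feed_asset_url` and the plain dict of settings are assumptions made for illustration.

```python
from urllib.parse import urljoin


def feed_asset_url(settings, src):
    """Return an absolute URL for src, for use inside a feed entry.

    FEED_ASSET_URL is the hypothetical override proposed in this issue;
    when it is missing or empty, SITEURL is used as the prefix.
    """
    base = settings.get("FEED_ASSET_URL") or settings["SITEURL"]
    # Normalise slashes so the path is appended to the base rather than
    # replacing the last path segment of the base.
    return urljoin(base.rstrip("/") + "/", src.lstrip("/"))
```

With only SITEURL set, `static/images/a.png` would resolve against the site; with the override set, the same path would resolve against the CDN prefix instead.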

justinmayer commented 11 years ago

Hey Charles. Saw the discussion in IRC that prompted this. Have you defined SITEURL with something like http://example.com ? Have you explicitly set RELATIVE_URLS to False?

For your relative source paths, have you tried /static/images/2013_03_12/DigitalRadioLogistics.png instead of static/images/2013_03_12/DigitalRadioLogistics.png ?

I have a feeling that some combination of the above will address this problem. If it doesn't, try the new syntax for linking to internal content.
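For reference, the two settings asked about above would look roughly like this in a Pelican settings file. This is a sketch only: the file name `pelicanconf.py` and the example domain are assumptions, and the values are illustrative.

```python
# pelicanconf.py (illustrative values only)

# Base URL of the published site, without a trailing slash.
SITEURL = "http://example.com"

# Turn off document-relative URL generation.
RELATIVE_URLS = False
```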

Can you explain difference between FEED_ATOM and FEED_ALL_ATOM?

I believe the contributor's thought was the latter would include all translations, while the former would only include items written in the default language.

kickingvegas commented 11 years ago

Setting RELATIVE_URLS to False makes no difference regardless of whether a '/' is placed in front of the path.

The new |filename| syntax does work; however, it is problematic in that I would have to modify every img tag in my content to support it.

Also, thinking about it, it would be better if static/ weren't hard-coded into the path when authoring content. That way, one could preview the content file itself (in this case Markdown) and have its images display without requiring Pelican to render it.

justinmayer commented 11 years ago

Could you test and see whether the behavior you describe above is still present in current master? I seem to recall that the static/ hard-coding was removed recently.

Edit: The latter is due to be addressed as part of #795.

kickingvegas commented 11 years ago

Just checked out master and the fix for static is not in.

glyph commented 10 years ago

I just hit this too.

As a result, my first post mentioning Pelican showed up on an aggregator with the broken image link. A little embarrassing for Pelican, I think :-).

I'll re-generate whenever an update is available, though.

justinmayer commented 10 years ago

Probably best to use absolute URL links until a relevant enhancement has landed.

glyph commented 10 years ago

Using absolute URL links makes it impossible to draft posts locally, though.

justinmayer commented 10 years ago

Quite true. If someone would like to step forward and modify the behavior such that relative URLs are converted into absolute URLs only for feeds, I imagine that would solve this problem.

glyph commented 10 years ago

Well, to be clear, I have explicitly set RELATIVE_URLS to False when I generate my site for deployment; so I would expect I wouldn't need to type an absolute URL.

justinmayer commented 10 years ago

The RELATIVE_URLS setting is document-relative. When I said "relative URLs" above, I was referring to root-relative URLs, which is what you get by default (i.e., if RELATIVE_URLS is False and SITEURL is undefined).
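The three link forms being distinguished here can be shown with the standard library alone. This is a sketch that does not use Pelican's own code; the page path, image path, and SITEURL value are made up for illustration.

```python
import posixpath
from urllib.parse import urljoin

SITEURL = "https://example.com"    # assumed site URL
page = "/posts/2013/radio.html"    # hypothetical page that embeds the image
image = "/static/images/logo.png"  # hypothetical image location

# Root-relative: resolved against the root of whichever host serves the page.
root_relative = image

# Document-relative: resolved against the directory of the embedding page.
document_relative = posixpath.relpath(image, start=posixpath.dirname(page))

# Absolute: needs no context to resolve.
absolute = urljoin(SITEURL, image)

print(root_relative)      # /static/images/logo.png
print(document_relative)  # ../../static/images/logo.png
print(absolute)           # https://example.com/static/images/logo.png
```

An aggregator that republishes a feed entry has neither the original page location nor the original host to resolve against, which is why only the absolute form is reliable there.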

As I said above, I agree that the user shouldn't need to type an absolute URL — as long as SITEURL is available, I can't think of any reason someone couldn't submit a PR that tells Pelican to generate absolute URL links in feeds.

glyph commented 10 years ago

Is there any reason that this should be specific to just feeds? If I specify SITEURL and RELATIVE_URLS=False, my initial assumption was that I'd get absolute URLs everywhere.

justinmayer commented 10 years ago

Sorry for taking so long to respond, @glyph. Yours is a reasonable assumption. Some folks have expressed a desire for root-relative URLs for intrasite links (in production as well as development). In an ideal world, it might be nice if RELATIVE_URLS could be set to root, document, or None.

glyph commented 10 years ago

No problem, open source, volunteer effort and all that :-). Perhaps I'll have to make the time to dive into Pelican myself…

To be clear, though, it doesn't sound like there's any opposition to implementing this now.

justinmayer commented 10 years ago

No opposition. It would be quite welcome, actually. As would be your diving into Pelican, by the way! (^_^)

andyli commented 9 years ago

I am also facing this problem... I wonder if there is any quick workaround before someone actually fixes it?

leotrs commented 8 years ago

@glyph did this ever get done?

glyph commented 8 years ago

@leotrs - which, regenerating my site's feeds? yes, I do so pretty regularly…

leotrs commented 8 years ago

@justinmayer did @andyli's PR fix this?

justinmayer commented 8 years ago

@leotrs: I haven't used @andyli's plugin, so I don't know whether it serves as a workaround or not.

justinmayer commented 4 years ago

I would like to revisit this issue and hopefully put it to bed. First, it seems to me that the most expedient solution is to use the {static}path/to/file feature that was added after this issue was originally filed, which in my testing produces absolute URL links in the generated feeds.

But that doesn't address what @kickingvegas said about this kind of solution:

I would have to modify every img tag in my content to support it.

Another approach could be to look for relative src="/…" links inside article/page content and replace those relative links with absolute links. So before handing off the content to the feed generator, we could do something like this:

import re

# Replace src="/path/to/file" with src="https://domain.tld/path/to/file"
regex = re.compile(r"""src=(["'])(/[^"']*)\1""")
content = regex.sub('src="{}{}"'.format(FEED_DOMAIN, r"\2"), content)

This is obviously a simplistic implementation and would need to be improved. For example, if someone writes an article about relative URLs and includes some in a code snippet, well... those relative URLs will also be summarily replaced.
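One way to avoid touching code snippets would be to parse the HTML and rewrite only the src attribute of real img tags, copying everything else through unchanged. The following is a sketch under that assumption, built on the standard library's `html.parser`. It is not how Pelican handles feeds, and the names `ImgSrcAbsolutizer` and `absolutize_img_src` are hypothetical.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcAbsolutizer(HTMLParser):
    """Re-emit HTML, changing only root-relative src values on img tags."""

    def __init__(self, base_url):
        # convert_charrefs=False keeps entities such as &quot; as written.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _emit_tag(self, tag, attrs, closing=""):
        if tag != "img":
            # Every other tag is copied through exactly as it appeared.
            self.out.append(self.get_starttag_text())
            return
        parts = []
        for name, value in attrs:
            if value is None:  # attribute without a value
                parts.append(name)
                continue
            if name == "src" and value.startswith("/") and not value.startswith("//"):
                value = urljoin(self.base_url, value)
            # The parser unescapes attribute values, so escape them again.
            parts.append('{}="{}"'.format(name, escape(value, quote=True)))
        self.out.append("<img {}{}>".format(" ".join(parts), closing))

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closing=" /")

    def handle_endtag(self, tag):
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        # Text, including the text inside <code> and <pre>, is left alone.
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&{};".format(name))

    def handle_charref(self, name):
        self.out.append("&#{};".format(name))

    def handle_comment(self, data):
        self.out.append("<!--{}-->".format(data))


def absolutize_img_src(markup, base_url):
    """Return markup with root-relative img src values made absolute."""
    parser = ImgSrcAbsolutizer(base_url)
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Because a relative URL quoted in an article's code snippet arrives as text rather than as a tag, it passes through `handle_data` untouched. The cost is a full parse of every entry, so the performance question raised below would still apply.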

Also, I have no idea whether there would be performance implications.

What do you think about this? Please post your thoughts here so we can wrap this one up. (cc: @getpelican/reviewers)

justinmayer commented 4 years ago

After some discussion about this topic, the consensus is that the aforementioned approach would likely be unacceptably brittle. There is already link syntax that produces proper absolute links in feeds, and it shouldn't be too difficult to script mass-replacement of img tags for those migrating to Pelican. For those reasons, I think this issue has been resolved.
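As a rough illustration of the mass-replacement mentioned above, the sketch below rewrites root-relative image links in Markdown content so they use the {static} syntax. The function names, the restriction to `*.md` files, and the assumption that image paths start with `/` are illustrative choices, not part of Pelican.

```python
import re
from pathlib import Path

# Markdown images: ![alt](/path) becomes ![alt]({static}/path)
MD_IMG = re.compile(r"(!\[[^\]]*\]\()(/(?!/)[^)\s]*)")
# HTML img tags: <img src="/path"> becomes <img src="{static}/path">
HTML_IMG = re.compile(r"""(<img\b[^>]*?\bsrc=["'])(/(?!/)[^"']*)""", re.IGNORECASE)


def to_static_links(text):
    """Prefix root-relative image paths in text with {static}."""
    text = MD_IMG.sub(r"\1{static}\2", text)
    return HTML_IMG.sub(r"\1{static}\2", text)


def migrate(content_dir):
    """Rewrite every Markdown file under content_dir in place."""
    for path in Path(content_dir).rglob("*.md"):
        original = path.read_text(encoding="utf-8")
        updated = to_static_links(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
```

Like the regex discussed earlier, this would also rewrite matching text inside code snippets. Since it is a one-time migration, running it on content under version control and reviewing the diff before committing is probably the safest way to use it.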