Swimburger opened this issue 4 years ago
Thanks for the suggestion! This is certainly something for us to look into. I feel like allowing the SWA routing to route based on crawler (user agent) wouldn't be too bad and would unblock this scenario.
There are some changes in progress for making our routing more robust and adding a number of new scenarios. I'll add this to our backlog for exploration!
@anthonychu Hi. Are there any plans for implementing this feature? Is there a workaround to redirect crawlers to my server for server-side rendering?
@Taras-Tyrsa @Swimburger If we were to allow user agent based routing, what would identify crawlers?
@anthonychu There are documented lists of user agents for Bing, Google, Yahoo, etc., but I don't believe that's something the Azure team should maintain and keep in sync. Moreover, crawlers are not the only use case. The best option would be to allow developers to configure redirects based on user agent (regexp?), client IP/range, request URL, headers, etc., somewhere in the Azure portal, or better yet via some config file in the deployment.
My use case is focused on SEO, so whatever would help me achieve Google's JavaScript SEO best practices described here: https://developers.google.com/search/docs/guides/dynamic-rendering
They are using User Agents to detect crawlers, so we should at least be able to do that.
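For illustration, user-agent based detection is basically a regex over documented bot user agents. A minimal sketch below; the list is deliberately short and incomplete, which is exactly the maintenance concern raised above.

```js
// Illustrative only: a hand-maintained (and necessarily incomplete) bot list.
const BOT_UA_PATTERN =
  /googlebot|bingbot|yandex|duckduckbot|slurp|baiduspider|facebookexternalhit|twitterbot|linkedinbot/i;

function isCrawler(userAgent = "") {
  return BOT_UA_PATTERN.test(userAgent);
}

// Examples:
isCrawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"); // true
isCrawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0");                   // false
```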
Besides SEO, I also see a need for prerendering when sharing URLs that rely on Open Graph tags.
I believe Netlify is the main comparison: https://docs.netlify.com/site-deploys/post-processing/prerendering/#app
If Azure were to pull it off, you would be way ahead of Amplify (opened in 2019): https://github.com/aws-amplify/amplify-console/issues/91
It would be a game changer if Azure Static Web Apps could support pre-rendering for crawlers. Most client-rendered JavaScript apps written in React, Angular, Vue, etc. have the same issue with SEO/crawlers/Open Graph. I have seen various hacks, custom redirects, and so on to overcome this, but nothing solid out there.
Frameworks like Next.js handle this with server-side rendering, pre-rendering pages at build time, or ISR. Still, in most cases you don't want to encapsulate/rewrite your app in a third-party framework just for this. It's also not the best option if your site has a lot of dynamically rendered components or if you want to host it as a serverless static site.
Would be great if Azure static web apps can offer this feature. 👍
Sounds like there are some different requests here. Let me list them out and let me know what I've missed.
- If a request is from a crawler (specific user agents), do something else (call a function?)
- If `_escaped_fragment_` is passed in the query, do something else (call a function?). As pointed out in Netlify's docs, this is being deprecated by Google.

@anthonychu For the first bullet point (and maybe the second), Google is probably less of an issue. I have seen a need for prerendering when Open Graph images, titles, etc. are based on an API call, and Facebook and others don't give the page a chance to load, so the image fails to show when a URL is shared. I've only recently started looking at Azure, so I apologize if I have missed some possibilities. If it is possible to use something like https://docs.prerender.io/article/12-middlewares with Azure Static Web Apps, that will likely work for me, but ideally it would be nice to simply press a button to enable prerendering and forget about it.
This may be helpful to see what Netlify is doing: https://github.com/netlify/prerender
@anthonychu
> If a request is from a crawler (specific user agents), do something else (call a function?)
Guessing this would be a server-side redirect similar to staticwebapp.config.json redirects? Perhaps an option in the routes section to specify crawler user agents. Client redirects get treated as cloaking, which is why it would be cool if this were supported.
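Just to sketch the shape (purely hypothetical: none of this exists in staticwebapp.config.json today, and `userAgents` is a made-up property name), something like:

```jsonc
{
  "routes": [
    {
      "route": "/*",
      // Hypothetical property: SWA routes cannot match on user agent today.
      "userAgents": ["Googlebot", "bingbot", "DuckDuckBot"],
      // Server-side rewrite to a function that returns pre-rendered HTML.
      "rewrite": "/api/prerender"
    }
  ]
}
```

Since the rewrite would happen server side, only bots would ever see the alternate response, so it shouldn't trip the cloaking concern the way a client redirect does.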
@anthonychu
Is it possible to configure a rewrite in `staticwebapp.config.json` similar to this one, to use prerender.io?
https://www.formition.com/blog/prerender-with-azure
That rule excludes images and the like, matches on either the user agent or `_escaped_fragment_`, and sends a match over to https://service.prerender.io.
For single page apps, I imagine this would happen under `navigationFallback`. The docs show we can use `exclude`, but can we filter on user agent or query string params, pass the token header, and create the URL similar to `https://service.prerender.io/https://{HTTP_HOST}{REQUEST_URI}`?
For reference: https://prerender.io/how-to-install-prerender/
Hmm 🤔, or perhaps it would be better to do something along the lines of https://anthonychu.ca/post/azure-functions-serve-html/ (though I'm looking at Node.js), mix it with `navigationFallback`, and either return the normal single-page app index.html or get the prerendered page from prerender.io. How much would this affect response time?
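For what it's worth, here is a rough Node.js sketch of that idea: an HTTP-triggered function that checks the user agent and either proxies to prerender.io or hands back the SPA shell. The `path` query parameter, the host handling, and the shortened bot list are all simplifying assumptions, not a drop-in solution.

```js
// Azure Functions (Node.js, v3 programming model). Rough sketch only.
// Assumes Node 18+ (global fetch), a PRERENDER_TOKEN app setting, and that the
// rewrite rule passes the originally requested path as ?path=... (hypothetical).
// Also assumes /index.html and static assets are excluded from the rewrite,
// otherwise the non-bot branch below would loop back into this function.
const BOT_UA = /googlebot|bingbot|yandex|duckduckbot|facebookexternalhit|twitterbot|linkedinbot/i;

module.exports = async function (context, req) {
  const userAgent = req.headers["user-agent"] || "";
  const host = req.headers["x-forwarded-host"] || req.headers.host;
  const path = req.query.path || "/";

  if (BOT_UA.test(userAgent)) {
    // Bot: ask prerender.io for a rendered snapshot of the original URL.
    const res = await fetch(`https://service.prerender.io/https://${host}${path}`, {
      headers: { "X-Prerender-Token": process.env.PRERENDER_TOKEN },
    });
    context.res = {
      status: res.status,
      headers: { "Content-Type": "text/html" },
      body: await res.text(),
    };
  } else {
    // Human: return the untouched SPA shell.
    const res = await fetch(`https://${host}/index.html`);
    context.res = {
      headers: { "Content-Type": "text/html" },
      body: await res.text(),
    };
  }
};
```

On response time: this presumably adds a function invocation (and, for bots, an outbound call to prerender.io) to every page navigation, so humans pay some extra latency too unless the rewrite is scoped tightly with `exclude`.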
I support this feature request. It would be great to allow prerendering on Azure Static Web Apps.
This would, for example, allow hosting a Blazor WebAssembly app on Azure Static Web Apps while still having prerendering enabled, so that one doesn't have to use ASP.NET Core hosting to achieve the same functionality.
> If a request is from a crawler (specific user agents), do something else (call a function?)
Even if it is just to update the page `<title>` (setting the title is possible in .NET 6 now, I believe) and `<meta name="description">`, that would be such a great leap already.
There are hacks around to make this work, and some great blog posts on them, but I feel these are not bulletproof for all cases. Official prerendering support in Azure Static Web Apps would be fantastic.
@anthonychu When calling an API function from a route in `staticwebapp.config.json` that will return HTML, is it possible to get the original response headers that would have been sent if the route had not called an API function, and then modify them as needed for the function's HTML response? I'm interested in Node.js API functions in particular.
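(For reference, setting headers on an HTML response from a Node.js function is straightforward in the v3 programming model, as in the minimal sketch below; whether the static host's would-be headers can be recovered at all is the part I'm unsure about.)

```js
// Minimal Node.js Azure Function (v3 model) returning HTML.
// It only sets headers it constructs itself; recovering the headers the static
// host would have sent is the open question above.
module.exports = async function (context, req) {
  const html = "<!doctype html><html><head><title>Hello</title></head><body>Hi</body></html>";
  context.res = {
    status: 200,
    headers: {
      "Content-Type": "text/html; charset=utf-8",
      "Cache-Control": "no-store",
    },
    body: html,
  };
};
```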
@smakinson I noticed you utilize a prerender function on #519 here https://github.com/Azure/static-web-apps/issues/519#issuecomment-976510735. Are you able to share what you did for the function? I am trying to use prerender as well.
Thanks in advance.
@jonlighthill I was planning to adapt the prerender Express middleware (https://github.com/prerender/prerender-node) for Azure Functions and run all static web app requests through a function that decides whether to serve the prerendered version or not. But what I ended up doing instead is using a Cloudflare Worker, https://github.com/prerender/prerender-cloudflare-worker, with no change needed on the Azure end.
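For anyone evaluating that route, the worker boils down to roughly this pattern (heavily condensed, with a shortened bot list; this is not the repo's actual code):

```js
// Cloudflare Worker: condensed illustration of the prerender pattern.
const BOT_UA = /googlebot|bingbot|yandex|duckduckbot|facebookexternalhit|twitterbot|linkedinbot/i;
const IGNORED_EXT = /\.(js|css|png|jpe?g|gif|svg|ico|json|xml|woff2?)$/i;

addEventListener("fetch", (event) => {
  event.respondWith(handle(event.request));
});

async function handle(request) {
  const url = new URL(request.url);
  const ua = request.headers.get("user-agent") || "";

  // Only intercept page requests from known bots; everything else passes through.
  if (BOT_UA.test(ua) && !IGNORED_EXT.test(url.pathname)) {
    return fetch(`https://service.prerender.io/${request.url}`, {
      headers: { "X-Prerender-Token": PRERENDER_TOKEN }, // Worker secret binding
    });
  }
  return fetch(request);
}
```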
@smakinson Thanks for the response. @anthonychu any progress/updates on this feature?
Since this has already been achieved with IIS ARR (https://docs.prerender.io/docs/18-integrate-with-iis), is it possible to achieve something similar just within rewrite rules in staticwebapp.config.json instead of having to roll separate middleware?
I would be very excited as well, if this was possible. @anthonychu Any update on how far you are with this feature? I'd love to implement it on my website.
I think the ability to rewrite the url based on HTTP_USER_AGENT or QUERY_STRING is all that's necessary to achieve these requirements.
Right now you could rewrite the URL to bounce to a function, test the HTTP_USER_AGENT for botness, and handle the prerender scenario and/or return Open Graph meta tags. But if it's not a bot, you'd have to client-redirect to another route that skips the function check. Pretty sure this is considered cloaking, and in some cases the redirect isn't honored.
Ideally you would only rewrite the route if it's a bot.
Edit: As a temporary workaround, you could redirect to an Azure Function, test for botness, handle your prerender/head-meta, and then use something like YARP to pull in the static site. I guess in the case of static JS sites you would only need to respond with a modified index.html, so there's no real need for a full reverse proxy; you can just HttpClient the file or grab it with the blob API.
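A Node.js flavor of the same workaround, just to make the shape concrete (the `STATIC_HOST` setting, the tag values, and the lack of error handling are all simplifications):

```js
// Sketch: fetch the SPA's index.html from the static host and patch the <head>
// with Open Graph tags before returning it. Assumes Node 18+ (global fetch).
module.exports = async function (context, req) {
  const staticHost = process.env.STATIC_HOST; // e.g. https://<app>.azurestaticapps.net
  const html = await (await fetch(`${staticHost}/index.html`)).text();

  // Illustrative values only; real ones would come from your data for the requested route.
  const ogTags = [
    '<meta property="og:title" content="Example title" />',
    `<meta property="og:image" content="${staticHost}/og-image.png" />`,
  ].join("\n");

  context.res = {
    headers: { "Content-Type": "text/html" },
    body: html.replace("</head>", `${ogTags}\n</head>`),
  };
};
```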
This is still badly needed. I am setting up a new site and dealing with a lot of deficiencies trying to get SEO working properly on a Blazor SPA hosted through Azure Static Web Apps. Google says it runs two bot passes: one that does raw HTML indexing, and a later, more infrequent one that loads and executes JavaScript. However, I have yet to see it pick up my homepage correctly. Additionally, unless I'm missing newer documentation, Bing has the same issue; as stated here, it does not always run JavaScript like a browser against a webpage: https://blogs.bing.com/webmaster/october-2018/bingbot-Series-JavaScript,-Dynamic-Rendering,-and-Cloaking-Oh-My
Agree completely with the other comments here that something should be added to staticwebapp.config.json that allows for a botNavigationFallback or userAgentNavigationFallback setting to set an alternative root path to serve files from based on user agent. This way Azure Static Web Apps could serve alternative pre-rendered files based on the route names. Something like this would be really slick:
{ "navigationFallback": { // main spa page for human users "rewrite": "/index.html" }, "userAgentNavigationFallback": { // list of user agents to re-route "targetUserAgents": ["GoogleBot", "Bingbot", "DuckDuckBot"], // path to pre-rendered html files. "prerender_path": "/pre/" // tells the site to load from an html file with the same name (ex: /about would serve wwwroot/pre/about.html) "isRouteStaticHtml": "true" } }
I'm sure there is a more elegant way to do this, just putting these pieces in here to illustrate the issue/need. Would love to be involved in a beta program and try this feature out or give more input if you guys are seriously looking into use case requirements for this feature.
Is there any update on this? Specifically if static web app route can detect bots?
> Is there any update on this? Specifically if static web app route can detect bots?
Hasn't been an update for like 2 years. Where the stale bot at?
Yup. This is still difficult to do. Sometimes I stand up an App Service just to handle Open Graph tags and rewrite the index.html, so clients hit my App Service IIS just to load an SPA index hosted on Azure SWA.
But OG is more popular than ever now, what with Teams, SMS chats, and any number of apps trying to render a nice-looking preview of a posted URL.
Thanks for the feedback everyone. Based on this documentation, https://developers.google.com/search/docs/crawling-indexing/javascript/dynamic-rendering, prerendering based on crawlers no longer seems to be the recommendation; instead, a longer-term solution such as server-side or static rendering is the recommended path. Frameworks like Nuxt/Next/SvelteKit provide this functionality, which is more in line with what you are looking for, and they can be hosted on Static Web Apps.
We are still noting the feedback, and we may look into providing a more managed offering that routes requests through some type of compute for prerendering before they hit Static Web Apps, once we have more distributed compute options available.
Thomas, I'm fine closing this request because it has "prerender" in its title, but this request was also about dynamic (server-side) rendering. With an SWA we're currently limited in how we can intercept the HTTP GET for index.html, engage our backend API, and return modified HTML. That is necessary to perform server-side dynamic rendering.
In order for my SWA URL to be richly shareable on social media, I have to wrap the React app in HTML with the appropriate meta tags. I currently have no way to do this without standing up a full App Service and handling it there.
Am I missing an easy way for a SWA React app to dynamically output OG meta tags in the raw index.html based on the request? And if I could do that only for 'bot' requests, even better, letting the most-used path be the plain SWA path and the less-used path be the one with the dynamic OG tags.
+1, we need dynamic rendering, probably as an option under staticwebapp.config.json.
Many static web apps (JS, Blazor WASM, etc.) require pre-rendering to be more SEO friendly. Google's crawler specifically handles JS apps quite well, but it blocks DLLs, so Blazor WASM isn't loaded properly. Many other search engine crawlers don't execute JavaScript at all.
There are ways to serve pre-rendered content when a crawler is detected, but this requires a backend or webserver supporting rewriting the request. It would be a powerful selling point for Azure Static Web Apps to natively support pre-rendering of pages and serving them to crawlers.
The Chrome org has this project called Rendertron which renders webpages including executing JavaScript (works with Blazor WASM) and returns a static rendered version (without JS). Integrating something like that or similar tech could add a lot of value IMO.
Alternatively, it would be great to extend the SWA routing system to allow rewriting the request for crawlers.