Closed: yeldarby closed this issue 10 months ago
Yyyyeah. I can replicate this. Huh. Let me dig into this a bit and see what I can learn
Okay, I've filed an internal bug for this (b/208311208) and am looking more.
Can you confirm for me: I can replicate the issue by calling the Cloud Function directly, like:
curl -i https://us-central1-<project>.cloudfunctions.net/allRoutes/robots.txt
Do you see the same behavior while calling the Cloud Function directly too?
Hey @yeldarby. We need more information to resolve this issue but there hasn't been an update in 7 weekdays. I'm marking the issue as stale and if there are no new updates in the next 3 days I will close it automatically.
If you have more information that will help us get to the bottom of this, just add a comment!
Confirmed
HTTP/2 404
etag: W/"0-2jmj7l5rSw0yVb/vlWAYkK/YBwk"
function-execution-id: y6tgyxobnbsu
x-cloud-trace-context: fc5f8f389e7a020aa55227c8bb65217e;o=1
date: Wed, 08 Dec 2021 09:41:48 GMT
content-type: text/html
server: Google Frontend
content-length: 0
Hi, I have the same issue.
Don't know if it can help, but I'm sharing my experience.
With the following hosting configuration, the cloud function was not invoked at all:
"rewrites": [
{
"source": "/robots.txt",
"function": "robots"
}
]
Looking into response headers, I noticed a cache hit. So, I added:
"headers": [
{
"source": "/robots.txt",
"headers": [
{
"key": "Cache-Control",
"value": "no-cache"
}
]
}
]
This way, the function started to be called, but the logs were not much help: none of my custom log entries showed up, as if the code of my function wasn't actually executed.
Sorry for not following up, but here's what I found out: the functions framework that wraps user code in GCF purposefully stops /robots.txt and /favicon.ico from being responded to by GCF. We're working internally to see if that can be changed, but in the meantime I'd suggest either (a) creating a static file /robots.txt to serve the correct content, or (b) migrating to Cloud Run, which doesn't have the same limitation AFAIK (though I do acknowledge that that is a bit more work).
If I get more of an update, I'll try to follow up again. Thanks for raising!
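For anyone weighing option (b): a minimal sketch of a Cloud Run service that serves robots.txt dynamically could look like the Express app below. The sitemap URL and the file content are placeholders of mine, not anything from the original project:

// Minimal Express server for Cloud Run; Cloud Run supplies the port via PORT.
const express = require("express");
const app = express();

// Cloud Run does not intercept this path, so the handler actually runs.
app.get("/robots.txt", (req, res) => {
  res.type("text/plain");
  res.send("User-agent: *\nDisallow:\nSitemap: https://example.com/sitemap.xml\n");
});

const port = process.env.PORT || 8080;
app.listen(port, () => console.log(`Listening on ${port}`));

Hosting can then point /robots.txt at that service with a "run" rewrite instead of a "function" rewrite.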
Hi, @bkendall. We're creating an API to generate the robots.txt file dynamically but are facing this same issue, so I'm wondering if there's an update on it.
No update as of today, sorry. The really short version of the situation is that GCF didn't design their product with "serve all HTTP requests" in mind - they tend towards specific event providers, even with Firebase's frequent use of HTTP. Their framework explicitly stops robots.txt and favicon.ico from being served, and that's unlikely to change. You may actually have better luck raising an issue in that repo so it can help show user need for it!
Since this isn't something that we can fix in the CLI though, I'm tempted to close this issue. Maybe I'll make a change in the Hosting emulator that will fail or at least print a warning on these paths... at least then it's not a surprise when it doesn't work on production.
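To make the framework behavior concrete: the effect is roughly as if the wrapper short-circuited those two paths before ever calling your code, along these lines (an illustrative sketch, not the framework's actual source):

// Rough illustration: the wrapper answers /robots.txt and /favicon.ico itself
// with an empty 404 and never invokes the user's handler.
function wrapUserFunction(userHandler) {
  return (req, res) => {
    if (req.path === "/robots.txt" || req.path === "/favicon.ico") {
      res.status(404).end();
      return;
    }
    userHandler(req, res);
  };
}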
Any update? Still facing the same issue :\ What's even stranger: I'm using Next.js and locally it works (without rewrites, using a robots.txt.ts SSR file inside the pages folder), but it doesn't work in production. Either way, rewrites still aren't working.
Unfortunately, I don't think this is going to be able to change. If you're using Cloud Functions (i.e. you're using the firebase-functions SDK, and either gen 1 or gen 2), you're going to be using the GCP framework that prevents those routes from being served.
The workaround mentioned before of using a static file still applies (since that content is resolved before rewrites), but I don't think we're going to be able to solve this problem here via the CLI.
If these two files being dynamically served via Functions is critical to your workflow, please let us know more by contacting support with a feature request.
This is quite bizarre behavior to leave undocumented for robots.txt.
I'd propose re-opening this and having the emulator warn when a 404 is thrown for these files, specifically linking to documentation for it.
I spent a full day trying to debug this one. This should be prioritized now with the push for Firebase web frameworks.
Any new progress?
I spent a full day trying to debug this one. This should be prioritized now with the push for Firebase web frameworks.
Totally! I'm not sure why this issue is closed? It should, at least, be documented somewhere!
I am having this same problem because I need to prevent crawling on my dev and staging environments, but not on prod. I've decided to go with moving a robots.dev.txt and a robots.prod.txt file into my build directory during the build phase. Of course this meant splitting my build into two different builds, yarn build:dev and yarn build:prod, but that was the only way I could come up with.
I hope you guys allow robots.txt to be rewritten to a function in the future.
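For what it's worth, the copy step can be a small Node script invoked from those build commands. Here's a sketch; the file names, the build directory, and the APP_ENV variable are just my own placeholders:

// scripts/copy-robots.js: pick the robots file for the target environment
// and copy it into the build output as robots.txt.
const fs = require("fs");
const path = require("path");

const env = process.env.APP_ENV === "prod" ? "prod" : "dev";
const source = path.join(__dirname, "..", `robots.${env}.txt`);
const dest = path.join(__dirname, "..", "build", "robots.txt");

fs.copyFileSync(source, dest);
console.log(`Copied ${source} -> ${dest}`);

yarn build:dev and yarn build:prod can then run the framework's own build followed by this script with the matching APP_ENV.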
I thought I was going crazy... everything else worked except /robots.txt
...
[REQUIRED] Environment info
firebase-tools: 9.17.0
Platform: macOS 11.5 (20G71)
[REQUIRED] Test case
It's not possible to use a Firebase Function to dynamically generate your robots.txt file; it always 404's. We're trying to dynamically insert the proper Sitemap: directive into our robots.txt file based on some environment variables, but the function will not run for this route. I've created a basic repo to demonstrate the problem here.
This repo rewrites every route (except /, which uses the static index.html as expected) to a function called allRoutes that simply prints the route path. It works for every path I've tried except for robots.txt, which 404's without hitting the function.
[REQUIRED] Steps to reproduce
Clone the test repo and deploy to Firebase. Try to navigate to robots.txt; it will 404. I've deployed the repo here for your convenience:
[REQUIRED] Expected behavior
The robots.txt route should run the allRoutes function and print {"hello":"from index.js","path":"/robots.txt"}, as it does locally on the emulator.
[REQUIRED] Actual behavior
robots.txt 404's and the function is not run.
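For reference, a function along these lines reproduces the setup described above. This is only a sketch of what the allRoutes function presumably looks like (the linked repo's actual code may differ), with a catch-all rewrite such as "source": "**" pointing at it in firebase.json:

// index.js: echo back the requested path for every route Hosting rewrites here.
const functions = require("firebase-functions");

exports.allRoutes = functions.https.onRequest((req, res) => {
  res.json({ hello: "from index.js", path: req.path });
});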