KV and workers.dev Route Errors

ampproject / cloudflare-amp-optimizer

Implementation of AMP Optimizer for Cloudflare Workers

Apache License 2.0

23 stars 5 forks source link

KV and workers.dev Route Errors #28

Open TugT4G opened 3 years ago

TugT4G commented 3 years ago

Hi,

I have set up and published my worker to Cloudflare but there seems to be one issue I am having and that is with the KV Namespace. Below is the exact code I put in my wrangler.toml file and the KV cache is enabled via the config.json file:

[env.prod]
name = "the name PowerShell gave when creating the KV namespace"
route = "https://www.examplesite.com/*"
vars = { MODE = "prod" }
kv_namespaces = [
         { binding = "KV", id = "my-id" }
]

I created it using the wrangler kv:namespace create "KV" command.

I did not place a preview_id because I wasn't sure how to get one so I left it out. I only have the env.prod code for KV in the wrangler.toml file.

When viewing the worker via the console I get this error:

Uncaught Error

I am not exactly sure as to why the KV would not be defined?

Also, the workers.dev route is active but leads to an Error 523 page. Seems there is a hosting error. I am thinking I turn that off and go straight through the additional route I added instead?

Thank you in advance for any help I receive!

Thanks and regards,

Tug

katsar0v commented 3 years ago

I also get the same error even though I did everything as described:

TugT4G commented 3 years ago

With the Official AMP Plugin installed and Reader Mode selected to choose a theme that's AMP compatible, the worker will automatically find the ?amp=1 suffix since we cannot use query strings for the routes, correct? I am wondering if that has anything to do with anything?

samouri commented 3 years ago

@TugT4G: the KV not bound error is more than likely because you are only creating the KV binding for the prod environment. You have to ensure KV is bound for every environment.

See a working example here. In the example, notice that the KV binding is repeated in each environment.

TugT4G commented 3 years ago

@samouri thank you for that information. I am just confused about how you got the same KV binding for each environment?

Do I have to create an environment along with a variable through a Wrangler command or just specify the environment and give it the prod, dev, and beta variable in the wrangler.toml file, then, when it's published, it will submit that information automatically?
Whenever I create a KV Namespace it automatically puts the name I created as the binding. Would I just change each namespace's binding to KV and it will be okay? For example, if I create a namespace, "dev", for the dev environment:
```
[env.dev]
name = "t4g-amp-**dev**"
vars = { MODE = "dev" }
kv_namespaces = [
     { binding = "dev", id = "MY_ID" },
]
```
- they keep the worker name and append the namespace name I created to the worker name then add that same name I created as the binding.

In that working example there is this code:

[env.prod]
name="optimizer-demo"
vars = { MODE = "prod" }
kv_namespaces = [  { binding = "KV", id = "4d45ebfb84fd498dbf024c6512e7dfc8", preview_id="86b8fd11b0b64de786d76b0ed82bd40b"},
]

The other two example names are "optimizer-demo-dev" and "optimizer-demo-beta" but the prod is just "optimizer-demo" instead of "optimizer-demo-prod". I assume the dev and beta via "optimizer-demo-dev" and "optimizer-demo-beta" are the created namespace? So in the prod namespace created you shouldn't add prod?

Maybe I am overthinking it a bit.

The answer I am thinking of is creating three namespaces and name one "dev", another "beta", and the last "prod". Then I would copy and paste the "kv_namespaces =[{...},]" code wrangler gives and change the binding from the KV Namespace name to KV. Then I would place each given kv_namespace code underneath each env code. Then I'll publish it with wrangler publish.

Is that the correct way to do it?

samouri commented 3 years ago

From what I can tell, the primary impact of name is the workers_dev URL that gets assigned when you publish. So if you want each environment to be published to a different workers_dev URL, then you should give each a unique name. Any system you come up with for creating names is valid. If the workers.dev url is not important to you or if you have it disabled, then I'm not sure what other impact there is.

Regarding KV Namespaces, I've never actually ran the wrangler CLI command for creating one and always populated it by hand. From what you've shown me, yes, the default binding will not work in that case. binding always needs to be KV to work with this repo.

The other two example names are "optimizer-demo-dev" and "optimizer-demo-beta" but the prod is just "optimizer-demo" instead of "optimizer-demo-prod". I assume the dev and beta via "optimizer-demo-dev" and "optimizer-demo-beta" are the created namespace? So in the prod namespace created you shouldn't add prod?

You might be overthinking this part 😬 . There was no specific intention here besides for that I liked how optimizer-demo.workers.dev looked better than optimizer-demo-prod.workers.dev

TugT4G commented 3 years ago

Okay, I believe I set up everything correctly, however, the errors are still there so I am a bit dubious. I just want to make sure I did not set it up incorrectly so if there are any errors they can be sniffed out.

I changed the bindings from the name of the namespace created to KV. Here is my wrangler.toml file: wrangler file

I set up the preview ID for each namespace with the wrangler kv:namespace create "prod" --preview command, which is the same if I did it by hand via the Cloudflare Dashboard>Workers>Manage Workers>KV and put prod_preview for the namespace as shown in the photo below: KV Namespaces

Here is my config.json file: config file

I did not add anything to the Environment Variables or the KV Namespace Bindings as it is recommended to be placed in the wrangler.toml file if using wrangler, but everything I have listed so far in the wrangler file should cover it when it's published.

The worker data recorded shows significant errors that keep increasing; not sure exactly what that could be about:

Thank you for the help you have given!

TugT4G commented 3 years ago

I also want to mention that I just started from scratch. I deleted that worker and started a new worker following the given instructions using the npx @cloudflare/wrangler generate my-worker https://github.com/ampproject/cloudflare-amp-optimizer command. I did not change the "my-worker" name.

I published it using the npm run prod # calls wrangler publish --env=prod command given here.

The results are still the same in regards to the errors and KV is not defined. I do get a 200 OK though.

When I published it, PowerShell said there are 3 moderate severity vulnerabilities and 10 packages are looking for funding.

I ran the npm audit command and I took a screenshot and copied the results of the finished audit. I can fix it by running the command npm audit fix --force so I did and the results of the errors and KV is not defined never changed. If you would like me to post the screenshot or paste in what I have copied then let me know; I am just not sure if there is sensitive information or not.

I hope the information I have given is of help!

samouri commented 3 years ago

Tug,

You should be able to run wrangler tail to see the actual errors.
Do you have a public repo you can point me to? Being able to reproduce the errors on my end would significantly help with debugging.

The results are still the same in regards to the errors and KV is not defined. I do get a 200 OK though.

I followed the instructions as well, but on my end all the KVs worked. In the instructions it says to "specify the bindings within the wrangler.toml file. The binding must be named KV". It is relatively vague, as instructions on configuring KV stores are much better on the actual CloudFlare Docs. I'll try to update the README.md with a link to the docs and also explicitly point out this specific pitfall (needing to specify the bindings in each environment).

Thank you for bringing the npm audit to my attention. I'll fix that asap by updating the necessary dependencies!

Hope this has been helpful, Jake

TugT4G commented 3 years ago

Awesome, I will do that.
Here is my public repo - https://github.com/TugT4G/t4g-amp-worker

I followed the instructions as well, but I overlooked a command via Wrangler CLI Commands - wrangler kv:namespace create "KV" --env=prod.

I was able to create each namespace under KV for each environment. Then I added the --preview command to each command - wrangler kv:namespace create "KV" --env=prod --preview.

The instructions PowerShell gave afterward me was:

Add the following to your configuration file under [env.prod]:
kv_namespaces = [
         { binding = "KV", preview_id = "MY_ID" }
]

Add the following to your configuration file under [env.prod]:
kv_namespaces = [
         { binding = "KV", id = "MY_ID" }
]

After doing so, I published the new worker with npm run prod # calls wrangler publish --env=prod .

However, I am still getting those KV not defined errors as well as the other errors I provided images for.

Here is my public repo of this project - https://github.com/TugT4G/another-test-worker

Did you want me to add the worker, or dist folders to my repo as well?

Thank you for your help and let me know if there is anything else you need that I left out.

TugT4G commented 3 years ago

Not sure if it matters at all but I am using nvm for Windows and I am using node v1.1.7, nvm v14.17.1, npm v6.14.13 (npm version came with the nvm version), and wrangler version 1.17.0.

In regards to the KV Namespace, I only ran the commands wrangler kv:namespace create "KV" --env=prod, wrangler kv:namespace create "KV" --env=beta, and wrangler kv:namespace create "KV" --env=dev to get the ID, then I added the --preview command to the end of each of those a separate three times to get the preview ID.

So I created the worker with the command listed in this repo, modified the config.json file directly with no commands used as well as the wrangler.toml file with no commands besides that kv:namespace command then I published it with the same command this repo suggests.

Maybe there are things I didn't do and that is why my worker is feeling a bit ill? I hope this extra information helps!

TugT4G commented 3 years ago

Update: I published a Cloudflare AMP Optimizer worker with the KV Cache set to false and removed the environments and KV Namespaces in the wrangler.toml file and there have been 0 errors, 1.15k success, and over 2k requests. Not sure if that helps in any way?

Another Update: This is going to sound crazy but that worker I mentioned above completely logs me out of my wp-admin pages. I can't even log in; it just keeps refreshing the login page. I deactivate the route of the worker and I can log in again. I ran the test over and over again and it is indeed that worker. Should I open it up in a separate issue?

Another Another Update: I read over the Introducing environments docs and it states, "Note that you must provide a route/routes key for each environment if you are deploying to a custom domain." I am deploying the worker to our custom domain, so I added the route as shown in my recent repository. Also, I deactivate the workers.dev route since it's null. Doing this was not shown in the instructions of the repo but I want to test out different ways. However, nothing has changed, unfortunately. I ran the npm audit and fixed the two vulnerabilities with the commands given, then it stated that 10 packages need funding - showed only 8. Unfortunately, I am not sure how to fund the packages: npm Audit Update npm fund

samouri commented 3 years ago

Here is my public repo - https://github.com/TugT4G/t4g-amp-worker

That repo doesn't seem to still exist.

Here is my public repo of this project - https://github.com/TugT4G/another-test-worker

I cloned another-test-worker and changed the values within wrangler.toml to my own ACCOUNT_ID, ZONE_ID, and KV IDs. It all worked as expected. Were you seeing problems with that repo?

Another Update: This is going to sound crazy but that worker I mentioned above completely logs me out of my wp-admin pages. I can't even log in; it just keeps refreshing the login page. I deactivate the route of the worker and I can log in again. I ran the test over and over again and it is indeed that worker. Should I open it up in a separate issue?

That does sound...crazy. I have no idea how the two things could be related!

Another Another Update: I read over the Introducing environments docs and it states, "Note that you must provide a route/routes key for each environment if you are deploying to a custom domain." I am deploying the worker to our custom domain, so I added the route as shown in my recent repository.

I didn't know you needed to repeat the route for each configuration but it does make sense. In your linked repo you give the exact same value for each environment. I don't think that can work. You should either only have a single environment, or have a different route for each env.

I ran the npm audit and fixed the two vulnerabilities with the commands given, then it stated that 10 packages need funding - showed only 8. Unfortunately, I am not sure how to fund the packages

You aren't required to "fund" any packages to use them. Feel free to ignore this message.

TugT4G commented 3 years ago

Thank you for your help and for getting back to me! You are much appreciated.

Sorry about listing a 404 repo.

I cloned another-test-worker and changed the values within wrangler.toml to my own ACCOUNT_ID, ZONE_ID, and KV IDs. It all worked as expected. Were you seeing problems with that repo?

Yes, I was. The KV is not defined, Client Disconnected, and Script Threw Exception it said.

That does sound...crazy. I have no idea how the two things could be related!

It did something to WordPress's REST API. Every time I activated the worker, our plugins, such as Google Site Kit, would no longer work and stated there is an error in WP REST API. The WordPress Site Health tool also stated an error in the REST API. A few seconds after deactivation, the error would go away. I did this process two more times to confirm. Strange indeed.

I didn't know you needed to repeat the route for each configuration but it does make sense. In your linked repo you give the exact same value for each environment. I don't think that can work. You should either only have a single environment, or have a different route for each env.

Good to know. I went ahead and just used the prod environment pointing to our site and left out the other two environments and the issue of the KV is not defined, Client Disconnected, and Script Threw Exception is still there. So I added the other two environments back pointing to different URLs of our site and the issue did not go away, unfortunately.

Not sure what to do from here since everything is working on your end just fine and we have done the same steps. Are there any other requirements needed for Cloudfare that I may be missing?

We have paid for Workers Unbound and we do have Cloudflare. However, since our domain is a “Partial” or CNAME setup via a hosting partner, I wonder if that could be causing the issue with KV not defined, Client Disconnected, and Script Threw Exception? Could that be causing the issue? Would we need to set up our domain through Cloudflare and configure the DNS ourselves? Some requests are successful, though, so I don't know how much more that would help.

ADDITIONAL EDITS:

I do have one more question. I am using the Official AMP Plugin and turned on the Query Parameter (/?amp=1). That shouldn't matter as the worker should sniff out the AMP pages, correct? I can only use our home domain for the route as a result.

UPDATE: I ran some more tests and published several different workers with different additions, subtractions, and commands. This brings me to a few more questions.

I assume that I did everything correctly now. I am not sure if there are supposed to be metrics for the worker or not, and I don't know how to check if the worker is doing what it is supposed to be doing? The prod environment is now the environment that published, and the name of the worker is the same name as my prod environment. The name always used to be the actual worker name that was created with npx @cloudflare/wrangler generate **my-worker** https://github.com/ampproject/cloudflare-amp-optimizer instead.

Check out my new repo - https://github.com/TugT4G/t4g-new-worker-amp-prod. It's essentially the same but I created this and published it with the command wrangler publish --env prod. This Cloudflare AMP Optimizer repo essentially does the same command but does the npm run prod, which calls wrangler publish --env=prod. wrangler publish --env prod command actually published the prod environment and named the worker under the name listed in the prod environment, which has never happened before. It then added the Bindings and Environment variables in the worker>Settings:
There are no metrics available as it states there have been no recent requests, and I am not sure if there should be or not? I don't know how to check if the worker is even working. However, when choosing the worker then "Quick Edit", there are absolutely no errors and each page I request from my site gets a 200 OK. So the KV not defined is no longer there. Since the metrics are not populating the Client Disconnected, Script Threw Exception, and Success are not going to be in the report. Are there supposed to metrics?
The workers.dev route is disabled and if I add my site as an additional route, I can no longer log into my site. Even when I use WPMUDEV to bypass the login I get a 403 error in regards to a cookie. If I use my host's admin login to bypass the login process it still doesn't work, so I just got rid of the "Additional Route". When I did enable the Additional Route, however, the metrics were populating and no errors showed at all in the metrics - just "Success".

Did I finally do this correctly? Are my results of the prod worker, no metrics, and environments/bindings that populated in "Settings" as shown in the photo above the correct results? If so, I think there may be something up with the command, npm run prod # calls wrangler publish --env=prod. Or maybe there is something up with my particular setup?

Side note, when I ran the command wrangler publish --env prod, Wrangler barked at me stating I need a route set for that environment. I added the route and it published. Hopefully, I did this correctly.

Sorry for the novel, but I just want to make sure my process is the right one indeed.

TugT4G commented 3 years ago

Below is that error message I am getting when the "Additional route" points to our site. Our additional route is https://www.my-site.com/groups/*. The asterisk also leads to the /wp-admin/* pages and causes the error:

{"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}}.

I found some interesting information in regards to something similar - https://stackoverflow.com/questions/46222009/wp-api-retrieving-drafts-cookie-nonce-is-invalid.

When I change the route to something like https://www.my-site.com/groups/* then I can access my /wp-admin/* pages again. However, the top admin bar disappears on all of those associated URLs. Just thought this information could be useful!

Not sure how to cover all AMP pages using routes, particularly posts.

TugT4G commented 3 years ago

@samouri not sure if you are still looking into this issue or not but I have ran some more tests and have done researched on the {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}} and Invalid Nonce! errors when attempting to log in. I read about it here - What is: Nonce - as well as the WordPress documents and it seems the worker may be interfering with the key somehow.

I do want to note that we have made the switch to Cloudflare Pro and updated all of the DNS settings. Everything is configured and performing properly.

UPDATE: I created a subdomain and added the DNS records to Cloudflare. Then I activated the worker and routed it to my subdomain. I have had none of the following issues in terms of logging in or being logged out. There was no {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}} error or Invalid nonce! error and I could access my /wp-admin/* pages just fine.

If that's the case, what in the world could cause that {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}} error I mentioned with my domain?

Things I have Done:

I have deactivated all of my plugins
Cleared my browser's and site's cache and my browser's cookies
Prevented WP Rocket and Cloudflare from caching /wp-json/, /wp-login.php, /wp-admin/*, and /register/.
Set the Browser Cache TTL and Edge Cache TTL to 4 hours
Changed themes to the default WordPress theme
Removed my .htaccess file
I have deactivated all of my plugins, .htaccess file, and put Cloudflare in Development Mode at the same time
Finally, I disabled REST API and restricted it to authenticated users only

Nothing has made any difference, unfortunately.

My Questions:

How do I fix the {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}} errors?
Is that error on my end or the worker's end since the additional route is our domain, which covers everything?
Is there a way to restrict the worker from messing around with the WP REST API? Or why is it affecting that and how do I stop it so we can finally use the Cloudflare AMP Optimizer worker?
Is the worker compatible with WordPress?
Is there a conflict between the caching or optimization settings in Cloudflare and the worker?

The problem seems to lie with the worker interfering with WP REST API since I am using our home domain for the Additional route.

This is what I have done when publishing my worker:

METHOD 1: We followed all of the instructions listed in this repo. I used the command, npx @cloudflare/wrangler generate my-worker https://github.com/ampproject/cloudflare-amp-optimizer to create the worker using this repo as a reference. I enabled the KV Cache via the config.json file and created the environments and KV bindings as instructed with the command wrangler kv:namespace create "KV" --env=prod and wrangler kv:namespace create "KV" --env=prod --preview. This same process was done with the beta and dev environments. After the wrangler.toml and config.json files were configured per the instructions, I used the command npm run prod # calls wrangler publish --env=prod. You can see my example repo here - https://github.com/TugT4G/t4g-new-worker-amp-prod.

The weird thing is, that wrangler publish command I listed above, npm run prod # calls wrangler publish --env=prod, to publish my project doesn't seem to publish the prod environment at all as I cannot see any added Environment Variables or KV Namespace Bindings via Settings of the worker.

On top of that:

There are thousands of "Client Disconnected" and "Script Threw Exception" errors with very few Success requests.
Also, I see a KV is not defined error in the console.
The route I added gets added to the worker's "Additional routes" as expected. However, this prevents me from being able to access any of my /wp-admin/* pages, therefore, admins nor anyone else can log in to our site and everyone gets logged out. It simply redirects everyone back to the login page and when attempting to log in I receive an Invalid nonce! error message.
If I use my host or WPMUDEV to log into our admin account to WordPress, I receive this error: {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}}. I looked that up and it seems to possibly be a cache issue or a conflict with the WP REST API Key. This is a security measure by WordPress.
I also published the worker then add the Environment Variables and KV Namespace Bindings manually and the errors are gone but the {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}} error is still there.

METHOD 2: When I use the command wrangler publish --env=prod instead of npm run prod # calls wrangler publish --env=prod, however, it tells me, of course, I need a route to publish that environment, so I add the route and it publishes that environment along with the "Environment Variables" and "KV Namespace Bindings". I can now see the "Environment Variables" and "KV Namespace Bindings" in the "Settings" of the worker without having to add them manually.

However, my "Additional route" that is added to the worker still prevents all users from logging in and logs out all users. When this project is published, there are no KV is not defined errors or "Client Disconnected" and "Script Threw Exception" errors. It's all successful requests. I cannot keep the additional route enabled. The workers.dev route is null so that is kept deactivated.

samouri commented 3 years ago

@TugT4G: thank you for all of the information! It does sound like we probably have a bug related to clobbering headers or query params that you are running into when accessing the wp-admin. I'll investigate and get back to you.

You've also identified another bug! The npm run prod is missing the --env flag.

The KV Binding error is a standard CF Worker error signifying that the env you are publishing has incorrect configuration. I'm guessing this was related to the missing --env=prod in the npm run prod command.

Is the worker compatible with WordPress?

I thought so, but based on your findings it is likely we have a bug or two lurking. I'll create a WP site to test it out on.

TugT4G commented 3 years ago

@samouri Thank you for that! Glad to see I wasn't losing my mind lol.

I have a little more details on that cookie issue as well as something new I think may be beneficial for you to look into when you create your WordPress test site (AMP validation issue I ran into). There are also a few more things I would like to give some feedback on that you are probably well aware of but it could be further clarified in the readme:

I set the routes, /wp-admin/*, /wp-admin/, /wp-json/*, /wp-json/, /wp-includes/, /wp-includes/*, /wp-content/, /wp-content/*, /wp-login.php*, and /wp-register/ to none so the worker would bypass them. When I did that, the cookie error was gone. Anyone could log in. However, no one would know they’re logged in because the frontend looks as if you’re still logged out, i.e. you still see login and register with no profile links. This also pertains to admins; admins would no longer see the admin bar either in the front end (non /wp-admin/ pages). In order to get to the admin pages, you need to fill in /wp-admin in the URL.
I found that Wrangler will not publish the worker unless a route has been specified or the workers_dev = true. As it states in the Environment docs, "Wrangler will fail to publish to an environment where route is defined alongside workers_dev = true". Maybe touch on this in the readme?
There are AMP validation issues. When I had the worker activated to try and figure out the bugs, some of our AMP pages are no longer served by Google for mobile devices now. So I tested the AMP pages while the worker was activated and received these issues: .
I ran the wrangler tail --format="json" command to see the logs. I did not notice any errors but I also am inexperienced at this whole thing. I did notice a canceled message for one but everything looked fine.
The keys for the KV Namespace used when the worker has activated store the amp URL and Values fine. The AMP HTML all resides in the Values, which is how it is supposed to be I assume.
I feel like the worker may be somehow blocking some key functionalities of WordPress/themes and prevents our site from outputting needed information to the browser?

The KV Binding error is a standard CF Worker error signifying that the env you are publishing has incorrect configuration. I'm guessing this was related to the missing --env=prod in the npm run prod command.

Yep, when I manually added the Environment Variables and KV Namespace Bindings the KV is not defined error disappeared. When I published the worker with wrangler publish --env=prod, the name of the worker not only changed to the prod but it worked right out of the gate with no issues/errors via the console and metrics.

Other than that I have nothing further to report. Thank you for your help and for investigating this further!

TugT4G commented 3 years ago

I also just received some feedback from other developers in regards to the worker. They created a WordPress test site and configured the worker to their site. They also received that {"code":"rest_cookie_invalid_nonce","message":"Cookie nonce is invalid","data":{"status":403}} error.

They stated:

[...] suggests you excluding SSO endpoints from the worker. [...] While this is the full request (GET) from the HUB to open Access to your site via SSO, this should be the API endpoint: {"code":"rest_missing_callback_param","message":"Missing parameter(s): domain, hmac, token, pre_sso_state, email","data":{"status":400,"params":["domain","hmac","token","pre_sso_state","email"]}}

Hope that helps! If you need anything else just let me know. I would love to utilize this worker and help out in any way I can.

TugT4G commented 3 years ago

@samouri I am sure you are on it and well aware but if I could be of any help there are WordPress cookies that the worker is probably caching that shouldn't be cached? For example, when I enable Cloudflare's cache everything and choose respect headers for the caching mechanism, it will also need some page rules that only business+ members get as explained here - https://support.cloudflare.com/hc/en-us/articles/236166048-Caching-Static-HTML-with-WordPress-WooCommerce.

This comment might be long overdue but I am assuming that it also involves cookies that the worker needs to bypass?

Not sure what you are finding out but I hope this helps!

TugT4G commented 2 years ago

Hey samouri,

What is the status of this repo?

Best regards, Tug