Staging server referenced from production

NickWeir63 commented 8 years ago

I just signed up 3 new Stroudco producers and they got the email below, which takes them to staging instead of production. I checked this with Sara and she thought that @pmackay had fixed this. How can I confirm these new producers on production? Thanks, N

[Screenshot: 2016-02-08 15 20 12]

lin-d-hop commented 8 years ago

Heya, this was actually something I looked into and 'fixed'.

Producers are definitely on production. The issue is that some config was reset on deployment, so the wrong site_url is currently stored in the config table. We are unsure why, and the Aussies say this hasn't happened to them before.

The fix was done from super-admin. However we seem to be having an additional issue of the prod database not always updating correctly and giving no warning to the user. I am not sure if this is a code issue or a DB issue. I am looking into this and related issues I've flagged in my recent work.

lin-d-hop commented 8 years ago

In the meantime I will fix the config and close this issue.

NickWeir63 commented 8 years ago

Thanks @lin-d-hop, there are now 4 new producers on Stroudco. We are asking them to log in to add their products to OFN. What password should they use to log in? Or can I get them to use the 'forgot password' option?

lin-d-hop commented 8 years ago

yeah forgot password option :-)

NickWeir63 commented 8 years ago

aaargh - password reset is going to staging too (see copied email below). Is there any way we can get these producers logged on?

A request to reset your password has been made. If you did not make this request, simply ignore this email.

If you did make this request just click the link below:

http://staging.openfoodnetwork.org.uk/user/spree_user/password/edit?reset_password_token=6clVwbN9yEdlI9zTA1Cr

If the above URL does not work try copying and pasting it into your browser. If you continue to have problem please feel free to contact us.

lin-d-hop commented 8 years ago

Can you just copy the link without the 'staging.' at the start?

It is just that the first part of the link is coming from a different variable to the one that is configured in the super_admin. I'm still trying to track down where it is being set and why it was overwritten on deployment.

lin-d-hop commented 8 years ago

Ok I think this is to do with the changed spree_preference entry being stored in cache and not looked up from the DB after it has been changed. That, at least, fits with the behaviour... still not fixed yet though :-/

lin-d-hop commented 8 years ago

Hi @oeoeaio, @RohanM, @mkllnk, My digging on this issue led me to believe that on the UK server the spree_preference cache is not always updating, so persisted variables aren't being used. This issue is one example. On deploy to prod the site_url was set to staging (a separate issue), and updating it in the super_admin config didn't flow through.

Just wondering if this is something you guys have encountered at all? If so, do you have workarounds? If you haven't, then either I'm on the wrong track or there is something different about our env.

Thanks for any quick thoughts :-)

mkllnk commented 8 years ago

I never experienced that. A lot of deploy issues had to do with the wrong environment. If you haven't done so already, check whether all scripts involved in deployment declare the production environment.

NickWeir63 commented 8 years ago

This is becoming urgent as we can't set up new producers on the Stroudco database. We have tried copying the link without the 'staging.' at the start, but although this logs you in to production, it does not confirm the new account.

NickWeir63 commented 8 years ago

@lin-d-hop I can't remember who we said we were going to ask to help with this?

lin-d-hop commented 8 years ago

Hi @RohanM @mkllnk @oeoeaio @daniellemoorhead Sorry to pass this issue over to you. Paul and I both had a look and did not get closer to a fix, which feels worryingly inadequate!

To give some more detail:

This issue started after our most recent production deploy. A number of vars were reset to point to staging. Anything configurable within Spree was reset, but the spree.root_url is not configurable within Spree_Config (the UI config setting is not used, and this feels like the desired behaviour).
We only see this issue in emails. The site itself functions correctly, with internal links working. Links that appear in emails point to staging.openfoodnetwork.org.uk. This means that new user/producer sign-ups and reset password links just don't work. Hence this is a priority.

We'd really appreciate a little update on any findings from your side to help with our understanding. Thanks so much for taking a look!

oeoeaio commented 8 years ago

Hi @lin-d-hop, @pmackay,

spree.root_url is just a helper method which works exactly like admin_url or new_admin_enterprise_path; it's just scoped to a spree route rather than one of ours. The fact that it is a url means that the helper knows it is supposed to tack a host onto the start.

To generate urls, the helper uses the hostname provided by ActionMailer::Base.default_url_options[:host], which we don't set explicitly, but which is set to the value of Spree::Config.site_url within Spree during initialization in: core/lib/spree/core/mail_settings.rb, via core/lib/spree/core/engine.rb

This presumably means that although we can change the value of Spree::Config.site_url from the admin UI, it will not resolve our issue because the value of ActionMailer::Base.default_url_options[:host] will not update until the next deployment.

So then it seems like the issue is: if your value for Spree::Config.site_url is being reset to staging.openfoodnetwork.org.uk at every new deployment, then you are never going to be able to get the right value into ActionMailer::Base.default_url_options[:host] unless you change it directly from the console.

Do you know what the issue is that is causing Spree::Config.site_url to be set to staging during deploy? Sounds like an environment issue in your scripts as @mkllnk said...
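To illustrate the mechanism described above, here is a minimal plain-Ruby sketch. It does not load Rails or Spree: FakeConfig and FakeMailer are hypothetical stand-ins for Spree::Config and ActionMailer::Base, and the production host openfoodnetwork.org.uk is assumed from the "link without staging" mentioned earlier in the thread. It only models the idea that the mailer host is copied once at boot and does not follow later changes to the setting.

```ruby
# Plain-Ruby model of a host that is copied once at boot.
# FakeConfig and FakeMailer are hypothetical, not real Spree/Rails classes.

class FakeConfig
  class << self
    attr_accessor :site_url
  end
end

class FakeMailer
  # Same shape as a default_url_options hash with a :host key.
  @default_url_options = {}

  class << self
    attr_reader :default_url_options

    # Stand-in for the boot-time initializer: copy the setting once.
    def boot!
      @default_url_options[:host] = FakeConfig.site_url
    end

    # Stand-in for a *_url helper used inside an email template.
    def reset_password_url(token)
      "http://#{default_url_options[:host]}/user/spree_user/password/edit" \
        "?reset_password_token=#{token}"
    end
  end
end

# The app boots while the setting still points at staging.
FakeConfig.site_url = "staging.openfoodnetwork.org.uk"
FakeMailer.boot!

# An admin later corrects the setting, but the mailer keeps its boot-time copy.
FakeConfig.site_url = "openfoodnetwork.org.uk"
stale_link = FakeMailer.reset_password_url("abc123")

# Overwriting the mailer's host directly (the "console" fix) takes effect.
FakeMailer.default_url_options[:host] = FakeConfig.site_url
fresh_link = FakeMailer.reset_password_url("abc123")

puts stale_link
puts fresh_link
```

In this model, the first link still carries the staging host even though the setting was corrected, and only the second link, generated after the host was overwritten directly, points at production.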

mkllnk commented 8 years ago

I did a few things on your production server. The bad news: I don't know what caused your issue. The good news: It seems to be working fine now.

One good thing to know is that when you change a variable like Spree::Config.site_url through the OFN site or through the database, other processes don't pick that up. In our case, it means that delayed_job doesn't know about the update and will still send emails containing the old URL unless you restart delayed_job. That can cause a lot of confusion.
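As a rough illustration of the point above, the following plain-Ruby sketch models two processes that each keep their own in-memory copy of a setting. FakeProcess and DATABASE are hypothetical names invented for this example, and the production host is an assumption. It is not how Spree or delayed_job are implemented; it only shows why a long-running worker can keep using an old value until it is restarted.

```ruby
# Hypothetical model: one shared "database" and per-process in-memory copies.
DATABASE = { site_url: "staging.openfoodnetwork.org.uk" }

class FakeProcess
  attr_reader :name

  def initialize(name)
    @name = name
    restart!
  end

  # Starting (or restarting) a process reloads settings from the database.
  def restart!
    @cached = DATABASE.dup
  end

  # Reads come from this process's own copy, not from the database.
  def site_url
    @cached[:site_url]
  end

  # A write updates the database and this process's copy only.
  def update_site_url(value)
    DATABASE[:site_url] = value
    @cached[:site_url] = value
  end
end

web    = FakeProcess.new("unicorn")
worker = FakeProcess.new("delayed_job")

# The setting is corrected through the web process (e.g. the admin UI).
web.update_site_url("openfoodnetwork.org.uk")

before_restart = worker.site_url # the worker still holds its old copy
worker.restart!
after_restart = worker.site_url  # the worker reloaded from the database

puts "#{worker.name} before restart: #{before_restart}"
puts "#{worker.name} after restart:  #{after_restart}"
```

Under these assumptions, the web process sees the corrected value straight away, while the worker only picks it up after restart!, which matches the advice to restart delayed_job after changing such a setting.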

I also noticed that there were a lot of pending updates on your server. That has nothing to do with this problem, but security updates are quite important (e.g. CVE-2015-7547).

So I installed updates via aptitude upgrade and restarted the whole server via init 6. After the restart I noticed that the OFN app was not starting up automatically. I ran update-rc.d unicorn_openfoodnetwork defaults to enable startup during boot. The current Ansible scripts should have done that, but maybe not at the time the UK server was provisioned.

After the restart the emails were sent correctly. To complete the test, I used Buildkite to deploy the new release v1.5. Everything seemed to work fine and I still got a welcome email with the right link to the production website.

So I still don't know how you got the staging data on your production server during deploy, but I hope it doesn't happen again.

Please test if everything is working as expected.

NickWeir63 commented 8 years ago

Yes, thanks @mkllnk, this is working perfectly now so I will close this. Big thanks to all who worked on it. N

pmackay commented 8 years ago

@mkllnk any idea why the order cycles on the staging db might have gone missing? @NickWeir63 reported this when testing earlier.

mkllnk commented 8 years ago

No. I just deployed to production which is saving the staging database to a file, but not changing it. Or did that happen when you staged version 1.5? The normal CI scripts are resetting the staging database to the point of the last production deployment. If you had another test branch in staging before, that data got lost. We do that, because some branches do migrations which have to be reverted before staging another branch that doesn't include these migrations.

openfoodfoundation / openfoodnetwork

Staging server referenced from production #821