Website Maintenance - Githubissues

PhyxionNL commented 9 years ago

Maybe add the reason why it is in maintenance mode, like what is exactly down.

niemyjski commented 9 years ago

We provide this already on the api status page as well as on the UI Projects status page. Are you not seeing this behavior? If so, please let us know.

PhyxionNL commented 9 years ago

It's the self-hosted version. When I checked the status page it didn't show any information about what was down (in my case: redis). But even if it was on the status page, it would still be useful to show at least some information about why it is in maintenance mode. As far as I could see (at least in the 1.x branch) it only goes into maintenance mode when the db or redis is down, and there is no option to 'manually' trigger it. It seems a bit odd to call this maintenance, as it's more like a system/component failure.

niemyjski commented 9 years ago

How do you feel we can make this even better? In some cases we don't have a status message (api goes down for a reboot/etc).

PhyxionNL commented 9 years ago

To clarify it a bit more, even something along the lines of:

The server is down because a required service isn't started.

would be better as I had no idea why the maintenance mode was triggered and had to look in the source to find out. It would be even better to show the missing service, but at least now you have some idea where to look.

niemyjski commented 9 years ago

Are you running 2.x or 1.x now? We should be sending down the reason as part of the message:

if (!result.IsHealthy) return StatusCodeWithMessage(HttpStatusCode.ServiceUnavailable, result.Message);
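As a rough illustration of the pattern in the line above (not the Exceptionless code itself), here is a minimal Python sketch. The names HealthResult, check_health and status_response are hypothetical; the only idea taken from the thread is that an unhealthy result becomes a 503 response whose JSON body carries the reason.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class HealthResult:
    """Hypothetical stand-in for the health result used in the line above."""
    is_healthy: bool
    message: str = ""


def check_health(probes: Dict[str, Callable[[], bool]]) -> HealthResult:
    """Run each probe in order and report the first dependency that fails."""
    for name, probe in probes.items():
        try:
            ok = probe()
        except Exception as exc:  # a probe that raises counts as a failure
            return HealthResult(False, f"{name} is unavailable: {exc}")
        if not ok:
            return HealthResult(False, f"{name} is unavailable")
    return HealthResult(True)


def status_response(result: HealthResult) -> Tuple[int, str]:
    """Map a health result to (HTTP status code, JSON body)."""
    if not result.is_healthy:
        # 503 Service Unavailable, with the reason in the body
        return 503, json.dumps({"message": result.message})
    return 200, json.dumps({"message": "OK"})
```

With a probe such as {"redis": lambda: False}, the response is a 503 whose message names redis, which is the kind of hint asked for earlier in this thread.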

Is there any chance you could test this by taking redis down? Let me know if you don't see a json result that contains a message.

PhyxionNL commented 9 years ago

This is the 1.x branch, we still have to update to 2.x.

niemyjski commented 9 years ago

I'll be creating a 2.0 release today in github with the binaries. We have jobs that get run that will migrate things over to 2.0.

niemyjski commented 9 years ago

I'd recommend forking the 1.x release and making this change as 2.0 already has this functionality. Do you have any plans on upgrading to 2.x?

PhyxionNL commented 9 years ago

Yes, we would like to upgrade the existing data to 2.x. We used to just compile 1.x ourselves instead of using the binaries directly. Can we still do this for 2.x (i.e., just compile 2.x and overwrite the existing server files)? Will the data then automatically migrate to 2.x, or do we have to run something manually?

niemyjski commented 9 years ago

It's a bit more work to be honest and we haven't really documented much. Below is an overview of what you'd need to do.

1. Stop the 1.x site.
2. Back up everything.
3. Install Java and ElasticSearch (events and stacks are stored in ElasticSearch now).
4. Run the redis-cli and type flushdb.
5. Replace the 1.x site with the 2.x site and update all of the configuration settings.
6. Run the StackMigrationJob.
7. Run the QueueEventMigrationsJob.
8. Run the EventMigrationJob.
9. Start the website (then the mongo migrations will run).
10. When it's safe, drop the error and errorstack mongo collections.

PhyxionNL commented 9 years ago

How do I run the jobs? It seems to require a Job.exe in the JobRunner folder, but I don't see it anywhere.

niemyjski commented 9 years ago

The job.exe can be found here: packages\Foundatio.1.0.142\tools and then you just pass the job as a parameter:

job.exe -t "Exceptionless.Core.Jobs.EventPostsJob, Exceptionless.Core" -c

You want to omit the -c for this stuff. It means to run it continuously, which you will want for only the EventMigrationJob.
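To make the invocation concrete, here is a small Python sketch that builds the job.exe command line for each of the three migration jobs and only adds -c for the EventMigrationJob. It is a sketch under assumptions: the type names are guessed from the "Exceptionless.EventMigration.<Job>, Exceptionless.EventMigration" pattern used later in this thread, and build_command and run_migrations are hypothetical helper names.

```python
import subprocess
from typing import List

# Path quoted in the comment above; adjust to where job.exe actually lives.
JOB_EXE = r"packages\Foundatio.1.0.142\tools\job.exe"

# (job name, run continuously?) in the order given in the migration overview.
# Only the EventMigrationJob is meant to run with -c.
MIGRATION_JOBS = [
    ("StackMigrationJob", False),
    ("QueueEventMigrationsJob", False),
    ("EventMigrationJob", True),
]


def build_command(job_name: str, continuous: bool, exe: str = JOB_EXE) -> List[str]:
    """Build the argument list for one job (type name pattern is an assumption)."""
    job_type = f"Exceptionless.EventMigration.{job_name}, Exceptionless.EventMigration"
    cmd = [exe, "-t", job_type]
    if continuous:
        cmd.append("-c")
    return cmd


def run_migrations(dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them one after another."""
    for name, continuous in MIGRATION_JOBS:
        cmd = build_command(name, continuous)
        if dry_run:
            print(subprocess.list2cmdline(cmd))
        else:
            # A job started with -c keeps running until it is stopped.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_migrations(dry_run=True)
```

The default is a dry run that only prints the commands, so the quoting can be checked before anything touches the data.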

niemyjski commented 9 years ago

If you could write up a walkthrough using what I gave you that would be greatly appreciated. We could put it on the wiki for others to use :)

PhyxionNL commented 9 years ago

I finally managed to get something to run, although I'm not entirely sure if the data migrations worked correctly. I do see the projects and organizational data, but all exceptions are gone.

niemyjski commented 9 years ago

Is there anything in ElasticSearch? Did you set the migration connection string that points to the same mongodb instance?

PhyxionNL commented 9 years ago

Elasticsearch reported many errors (failed to parse) with the action.bulk methods. I don't have time anymore today to look into this. Maybe I'll check again tomorrow.

Exceptionless 1.x already used quite a lot of packages and services but 2.x is using so many different components/services that it becomes difficult to find where the problem lies. It all seems like a bit too much for some logging.

niemyjski commented 9 years ago

We are only using Mongo, ElasticSearch and Redis. In the not too distant future we will move projects, users and organizations into ElasticSearch and get rid of Mongo. We wanted to wait because it's completely new and we wanted it to bake. If something goes completely wrong with ElasticSearch right now, we can replay all the events that came in (it would be hard to do this with users, projects and org data). If you want, I could meet up with you and help you with the migration.

PhyxionNL commented 9 years ago

Yes, but it also requires nodejs/bower/grunt, unlike 1.x which you could host through IIS. This removes a lot of flexibility as with IIS it was easy to add authentication (we don't want to have the site open to everyone as there is no option to disable registrations). I appreciate the offer to help with the migration, I'm absolutely interested. I don't have much time anymore to look into this this week, but perhaps next week is a possibility. Will 1.x receive continued support?

niemyjski commented 9 years ago

Actually we don't require any of that (We will be using github releases where it's all prebuilt). You just need to unzip the files and put them on any kind of web server (IIS included).

niemyjski commented 9 years ago

I'll be creating a github release today.

PhyxionNL commented 9 years ago

Thank you for the additional information, looking forward to that. Would the config still be in the grunt config file or do we have to set the api location in the web.config then? I hope I can find some time at the end of this week, otherwise I'll check again next week.

niemyjski commented 9 years ago

No, all the config would be in a cache friendly app.config.*.js file. You would just need to edit it with notepad and set the server url and any other options :).

I'd be more than happy to help you get this setup and would love any feedback that you may have.

PhyxionNL commented 9 years ago

Ah, yes, I completely forgot about that file. I'm sure I can manage to set up the website through IIS, but with the data migration I might require some assistance as it failed both times I tried it. I suspect it went wrong with the Job.exe, I'm still not entirely sure how this should be called. What I did:

Set up the api on IIS (including connectionstrings)
Copy Job.exe (including the dlls / configs from Job.exe) to the root of the IIS site.
Copy the EventMigration dll to the bin folder (in the root of the IIS site).
Run Job.exe -t "Exceptionless.EventMigration.StackMigrationJob, Exceptionless.EventMigration". Do the same for the other two jobs. The first takes a while (5 seconds), but the other jobs return immediately. There wasn't any output from job.exe either, besides a blank line. I'm not sure if it prints anything.

Maybe I'm doing this wrong, but given the amount of errors reported by the Elasticsearch instance I assume it did something.

niemyjski commented 9 years ago

I'd use the zip file from the release we created yesterday and run everything from there (check the app data folder). When you run the job.exe from command prompt you'll get all of the console messages written out after the process quits. I'd recommend wiping the ElasticSearch instance and starting over.

PhyxionNL commented 9 years ago

Is the command I wrote above the correct way to call the migration(s)? I noticed there is a jobs folder, but it doesn't have .bat files for the migrations (at least I couldn't find them).

niemyjski commented 9 years ago

Yeah, I wouldn't run those bat files unless you set the environment variable %WEBROOT_PATH% that they are using. Those are for azure web jobs. We should probably update them to check for both but we mainly host on azure. Since this is a one-time thing, I'd just manually shell them.

niemyjski commented 9 years ago

I've created a UI release (https://github.com/exceptionless/Exceptionless.UI) and updated the readme with hosting information.

PhyxionNL commented 9 years ago

Any reason why several files have hashes in their names? Like app.config.b9a9517b4d5b2684.js. This suggests that the file will not get the same name when another release is created, which makes updating a little more complicated.

niemyjski commented 9 years ago

Yes, the names have hashes so we can cache them for a really long time :). Then when we update the site the new index.html file will be processed and the new config settings/scripts will be processed as well (it's our cache buster). So the filename portion app.config will never change, just the hash.
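For readers unfamiliar with this kind of cache busting, here is a minimal Python sketch of the general technique: the file name embeds a digest of the file's content, so the name changes only when the content does. The hash algorithm and the 16-character length are assumptions chosen to resemble the example name above; this is not the actual build step used for the UI, and hashed_name is a hypothetical helper.

```python
import hashlib
from pathlib import PurePath


def hashed_name(file_name: str, content: bytes, digest_len: int = 16) -> str:
    """Return e.g. 'app.config.<digest>.js' for 'app.config.js'.

    The digest is a truncated SHA-256 of the content (an assumption;
    any stable content hash would serve the same purpose).
    """
    path = PurePath(file_name)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{path.stem}.{digest}{path.suffix}"


if __name__ == "__main__":
    old = hashed_name("app.config.js", b"serverUrl = 'http://localhost'")
    new = hashed_name("app.config.js", b"serverUrl = 'https://example.test'")
    print(old)
    print(new)  # differs from the first name because the content changed
```

Because index.html references the hashed name, browsers can cache the file for a long time and still pick up a new version as soon as index.html points at a new name.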

niemyjski commented 9 years ago

We recommend wiping out the whole directory when updating the spa app, and then just editing the app.config file.
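Since the hash in the file name changes between releases, a small helper can locate the config file after unzipping a new release. This is only a sketch: find_spa_config is a hypothetical name, and the pattern app.config.*.js is taken from the file name mentioned earlier in this thread.

```python
from pathlib import Path


def find_spa_config(directory: Path) -> Path:
    """Return the single app.config.<hash>.js file in the given directory.

    Raises if none or several are found, which usually means the old
    release was not wiped before the new one was unzipped.
    """
    matches = sorted(directory.glob("app.config.*.js"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one app.config.*.js in {directory}, found {len(matches)}"
        )
    return matches[0]
```

Calling find_spa_config(Path(".")) from the site root returns the file to edit, whatever hash the current release uses.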

PhyxionNL commented 9 years ago

Fair enough, it's not that there are that many configurations anyway. I'll check it out when I have the time, thanks for all the info!

niemyjski commented 9 years ago

Most setups will usually only have to configure the server url and maybe the web.config. We have an ssl redirect in the config you may want to edit if you aren't going to be using ssl.

ejsmith commented 9 years ago

I would recommend just running the jobs in memory for simple setups. We want it to be super easy to run. So I think we are going to want to create a separate release process that configures the app to be self hosted. In that scenario, we could actually just put the UI contents into the same folder as the API.

niemyjski commented 9 years ago

I'm going to close this issue because the maintenance page in 2.0 states the reason why it's offline. Please create a new issue if you run into any problems or need additional help with the jobs.

exceptionless / Exceptionless

Website Maintenance #65
