Multibit-Legacy / multibit-hd

Deprecated Bitcoin Wallet
https://multibit.org/blog/2017/07/26/multibit-shutdown.html
Other
173 stars 114 forks source link

Require secure error reporting solution #168

Closed gary-rowe closed 9 years ago

gary-rowe commented 10 years ago

The public beta or first release will need a secure error reporting mechanism for informing us of problems with the application. This will need to include:

The user can choose not to send the report and can see all the information that is going to be sent. The send process will encrypt it using our public key so that only MultiBit can read the contents of the report.

We will require an endpoint on the site to receive, throttle and filter reports.

Ideally this would be a commercial application but we may have to roll our own if we can't find anything suitable.

ghost commented 9 years ago

@mike-hearn mentioned that Lighthouse has a nice error reporting tool that is open source that we could use. Will need to reskin it for Swing

gary-rowe commented 9 years ago

Agreed. Let's reuse where we can

gary-rowe commented 9 years ago

Further research into this has lead to a combination of the following:

  1. Use a custom exception handler to grab the exception and triage for its importance
  2. If important enough to submit it should be packaged with a ZIP of the local logs for the day
  3. GPG encrypted with MultiBit.org BRIT key (already present on client)
  4. Submit to BRIT server over HTTPS for decrypting on the server
  5. Logs and relevant anonymous meta data (Operating System, MultiBit HD version etc) sent to the log analyser (on a separate secured machine)

The log analyser could be Elastic's Logstash with visualization provided by Kibana all brought together with Elastic Search. All are apparently available for free without support.

Here's a handy tutorial.

gary-rowe commented 9 years ago

Further discussion offline has lead to adding a manual error reporting button in the Help command bar. This will allow people to report a problem directly that doesn't cause an error. While this could be open for abuse it should allow better insight into any problems that users are experiencing.

In terms of feature delivery here's a plan:

1) Create a low level JDialog containing the following localised fields:

2) Integrate error reporting client side

3) Implement the request and response

4) Server processing

Notes: Later on we could go introduce a Location header in the response to indicate the URL of a Help article that explains a fix. This would require an extra button on the dialog that simply links to the outside world but requires a lot of back end parsing to correctly identify in most cases.

If we're happy with this plan I'll split this epic up into smaller tasks accordingly.

gary-rowe commented 9 years ago

I've loosely based it on the Mozilla Breakpad concept. If a crash occurs the user first sees it in the minimal mode to avoid fear and alarm: screen shot 2015-04-19 at 11 43 01 Clicking "Details" allows for closer inspection of content: screen shot 2015-04-19 at 11 42 32

To see this in action you need to trigger an exception. The quickest way is to add this line to CoreServices#168:

throw new NullPointerException("Boo!");

This will verify that the user friendly dialog can be rendered through reflection via Core calling up to Swing. This ensures that the dialog has access to the i18n features present in Swing, but still allows for ExceptionHandler to be referenced statically throughout Core.

In the event that ExceptionHandler cannot make the reflective call, a fallback dialog is provided which can be verified by changing ExceptionHandler#69 to have an invalid package:

screen shot 2015-04-19 at 12 00 46

This will need a bit more modification to allow for cancelling the operation and more descriptive (non-localised) text.

All uploading will occur in the ExceptionHandler Core class.

Ready for interim review before proceeding to add upload capability.

jim618 commented 9 years ago

Looks good so far. You might want to make it wider/ taller so as to give more space for the text in languages like Russian. Also, you've written 'Multibit' whereas our 'style guide' alwas describes it as 'MultiBit'.

gary-rowe commented 9 years ago

Here's a quick preview of the incomplete Russian translation: screen shot 2015-04-19 at 17 23 01 screen shot 2015-04-19 at 17 23 18

I think the dimensions should be OK (and they will wrap)

gary-rowe commented 9 years ago

Added support for manual error reporting in the Help screen:

screen shot 2015-04-19 at 18 12 34

Leading to a modified error reporting dialog:

screen shot 2015-04-19 at 18 17 35

Rather than create a new "error reporting wizard" I've opted to just re-use the low level dialog. This is intended to give the users a sense of doing something "low level" and unusual. That said, we must expect that this is going to get a lot of abuse so we'll have to rely on the server having good filters.

jim618 commented 9 years ago

Note in your button text that we normally once capitalise the first word.

Looks good - I think we want the same dialog yes - KISS

jim618 commented 9 years ago

The help contents page will need adjusting to reflect the fact there is a error reporting button. We probably don't want two ways of doing the same thing. (Though screen shots in github are useful)

gary-rowe commented 9 years ago

I've added most the client-side upload requirements now. I had to fiddle about with the BRIT code since FeeService provided a lot of what I needed already. Take a look at how it's been refactored, particularly at how CoreServices.createFeeService() is just a wrapper now. Also the HttpsUtils.doPost() is a direct lift out of FeeService.

Ready for interim review.

jim618 commented 9 years ago

When the Details button is clicked the panel the error reporter can (slightly) push off the screen. See screenshot:

screen shot 2015-04-19 at 22 36 32

edit: better screenshot

jim618 commented 9 years ago

Just had a review of the error reporting code - looks good so far.

gary-rowe commented 9 years ago

Thanks for the better screenshot - makes a lot more sense now. It's a trivial fix to the text area height.

gary-rowe commented 9 years ago

Added new repo to cover Error Reporting on the server side: https://github.com/bitcoin-solutions/error-reporting-service.

gary-rowe commented 9 years ago

Been hard at work on the Error Reporting Service and it's ready for an interim review. Since it's a standard Dropwizard service you'll need to do the following:

  1. Check out the repo as referenced above
  2. Create /var/error-reporting directory
  3. Copy contents of src/test/resources/fixtures/gpg to /var/error-reporting/gpg (same as for BRIT)
  4. Start the service with ErrorReportingService.main() (Program arguments are server config.yml)
  5. Should see various notifications about the Environment being in an IDE and the big banner
  6. Install "REST client" Firefox plugin or "Advanced REST client" Chrome plugin (or use cURL)
  7. Create a text/plain POST request as follows:
POST http://localhost:9191/error-reporting
-----BEGIN PGP MESSAGE-----
Version: BCPG v1.46

hQEMA+aIld5YYUzuAQf9H8h/Wrsnlh90KEDjOI3Z4FaCgO/FZm0niBTlI2jk+oYX
oSjx18Zc8eb8pKw4kGDJx8o1ezbQxBafzx4H0gHiva8R+sMqaciXXcyebTyIPG7s
gx1S2xAgDxD5JlHMudTzpI2ZhdpZfE9e0ElSfOB9ltMqKGenBcMjwxibXw1BSrcV
sW0LrG6EeHx5RCWA3qqLG8/aelReBLyA7pVHb3X2eJT3LmAOjGe/oUjqNLjMZxFk
4pZQvr32DDXEuTKA5Rn0BhZtSflQuD2L/v6O9sDr6m/x1xKeiJZ4jzfe4RS9Atpl
xq53zFyivmitJoJtkwNkEqyWEET+yZ169JDiuFXjOtLBMwGi+F6qbbPBcXqbdGZ7
e9huU3VAl6F57yJKr6UVq7mBk8KwSDWDjUm/tnt3nJnIPK79ewmfN/LU4Fj1xFrf
QZpQAYTU+ZR9QiaWCjPRXpcAdcxk4UpclccUX/3F4wD4eN0x+2O/Q2OBY1e8+cjF
ZxyWJAyTtkyOZ9Icuk+PfMfQS4VnQ7HLEfp40PLD9OhMzSwp8UdEPNGP1bfCpOus
iH1Nlq8L2qIliXVyOKQ9CfPITNiQdwBKwKwLDwQfofJ6pReaNlVrF8kqhKKNyKdx
Lep/CIM/XG2SG4HNCjwszyOziGI4lONE1yFjIZugvibckw6BVgGmrcFdHr31fJ63
u5tttus4QSQtFyk5ms6SI3MvApBKM3sR9hbQT+M3mJyHaBknNWFUD6I/O+LtmIkx
K0XFxkyguhKqbJfWio00IU00i2aMSUuKStAjlMryMQQ2BffxXo1tQahSaizdGs1L
def2Nrpx93OoR8ny498AC9uTmbZ1OmqF+unoCyfX+JiP/xpZeZLExonoyALwFSXV
QIugqv3e8ys2nU3HDaQsLTwtacObDwX4AbbWIvW0z/5uQC7iYnn7vsFwsC6xlnpA
JT3LwzksOeCULD/aC6rQKnNoYaOZOC+BPBC+3Rg/r2wKwgBONVJ/oC2LJ7a+kY36
Im43xJ8=
=m+b9
-----END PGP MESSAGE-----

You should see a response containing OK_UNKNOWN and a decrypted copy of the payload on System.out.

Assuming it all works the next step is to set up the ELK stack and integrate with Logstash.

jim618 commented 9 years ago

Good stuff Gary. I will have a look at that tomorrow.

http://bitcoin-solutions.co.uk

On Wed, Apr 22, 2015, at 09:30 PM, Gary Rowe wrote:

Been hard at work on the Error Reporting Service and it's ready for an interim review. Since it's a standard Dropwizard service you'll need to do the following:

  1. Check out the repo as referenced above
  2. Create /var/error-reporting directory
  3. Copy contents of src/test/resources/fixtures/gpg to /var/error-reporting (same as for BRIT)
  4. Start the service with ErrorReportingService.main() (Program arguments are server config.yml)
  5. Should see various notifications about the Environment being in an IDE and the big banner
  6. Install "REST client" Firefox plugin or "Advanced REST client" Chrome plugin (or use cURL)
  7. Create a text/plain POST request as follows:
POST http://localhost:9191/error-reporting
-----BEGIN PGP MESSAGE-----
Version: BCPG v1.46

hQEMA+aIld5YYUzuAQf9H8h/Wrsnlh90KEDjOI3Z4FaCgO/FZm0niBTlI2jk+oYX
oSjx18Zc8eb8pKw4kGDJx8o1ezbQxBafzx4H0gHiva8R+sMqaciXXcyebTyIPG7s
gx1S2xAgDxD5JlHMudTzpI2ZhdpZfE9e0ElSfOB9ltMqKGenBcMjwxibXw1BSrcV
sW0LrG6EeHx5RCWA3qqLG8/aelReBLyA7pVHb3X2eJT3LmAOjGe/oUjqNLjMZxFk
4pZQvr32DDXEuTKA5Rn0BhZtSflQuD2L/v6O9sDr6m/x1xKeiJZ4jzfe4RS9Atpl
xq53zFyivmitJoJtkwNkEqyWEET+yZ169JDiuFXjOtLBMwGi+F6qbbPBcXqbdGZ7
e9huU3VAl6F57yJKr6UVq7mBk8KwSDWDjUm/tnt3nJnIPK79ewmfN/LU4Fj1xFrf
QZpQAYTU+ZR9QiaWCjPRXpcAdcxk4UpclccUX/3F4wD4eN0x+2O/Q2OBY1e8+cjF
ZxyWJAyTtkyOZ9Icuk+PfMfQS4VnQ7HLEfp40PLD9OhMzSwp8UdEPNGP1bfCpOus
iH1Nlq8L2qIliXVyOKQ9CfPITNiQdwBKwKwLDwQfofJ6pReaNlVrF8kqhKKNyKdx
Lep/CIM/XG2SG4HNCjwszyOziGI4lONE1yFjIZugvibckw6BVgGmrcFdHr31fJ63
u5tttus4QSQtFyk5ms6SI3MvApBKM3sR9hbQT+M3mJyHaBknNWFUD6I/O+LtmIkx
K0XFxkyguhKqbJfWio00IU00i2aMSUuKStAjlMryMQQ2BffxXo1tQahSaizdGs1L
def2Nrpx93OoR8ny498AC9uTmbZ1OmqF+unoCyfX+JiP/xpZeZLExonoyALwFSXV
QIugqv3e8ys2nU3HDaQsLTwtacObDwX4AbbWIvW0z/5uQC7iYnn7vsFwsC6xlnpA
JT3LwzksOeCULD/aC6rQKnNoYaOZOC+BPBC+3Rg/r2wKwgBONVJ/oC2LJ7a+kY36
Im43xJ8=
=m+b9
-----END PGP MESSAGE-----

You should see a response containing OK_UNKNOWN and a decrypted copy of the payload on System.out.

Assuming it all works the next step is to set up the ELK stack and integrate with Logstash.


Reply to this email directly or view it on GitHub: https://github.com/bitcoin-solutions/multibit-hd/issues/168#issuecomment-95326275

jim618 commented 9 years ago

I built the project and posted the test post successfully. Looks good.

One gotcha was when I tried running it on the command line I got:

Crypto keys Exception in thread "main" java.lang.NoClassDefFoundError: org/bouncycastle/jce/provider/BouncyCastleProvider
    at org.multibit.hd.error_reporting.ErrorReportingService.main(ErrorReportingService.java:88)
Caused by: java.lang.ClassNotFoundException: org.bouncycastle.jce.provider.BouncyCastleProvider

Must be something missing from the shaded jar. I just ran it in Intellij but when we run it on the real server we'll be firing it up from a SSH command line / batch script I expect.

gary-rowe commented 9 years ago

Thanks, Jim. It's only designed for running in the IDE at the moment. There are a few things I've found along the way that'll need to be backfilled into the BRIT service as well.

gary-rowe commented 9 years ago

I've added an extra upload status message below the details indicating that upload has started and later on completed: screen shot 2015-04-24 at 20 46 02

I've also truncated the logs to 200Kb which equates to about 20+ pages of text on a standard monitor which should be sufficient to identify a recent failure without causing too much flooding of data.

Since this completes the client-side part of this issue and the ErrorReportingService project will handle issues for the server-side I'm putting this up for final review.

Ready for review and close.

jim618 commented 9 years ago

It's looking good. I found a couple of points:

In the log was:

[2015-04-25 10:44:36,896] WARN  [safe-fixed-error-reporting-0] o.m.h.c.e.ExceptionHandler - Failed to POST error-report.txt.asc ! java.io.IOException: Server returned HTTP response code: 500 for URL: https://multibit.org/error
! at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) ~[na:1.7.0_40]
! at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254) ~[na:1.7.0_40]
! at org.multibit.hd.brit.utils.HttpsUtils.doPost(HttpsUtils.java:66) ~[classes/:na]
! at org.multibit.hd.core.error_reporting.ExceptionHandler.handleErrorReportUpload(ExceptionHandler.java:306) ~[classes/:na]

It would be better if the user was told the upload failed.

gary-rowe commented 9 years ago

Couple of points here:

  1. There's not much the user can do if the upload fails (multibit.org could be gone) so I thought it would be best not to tell them so they can get on with recovering from the problem without an additional burden
  2. I moved the OS info into its own section in the upload which is present at upload time so always gets included (see OS commit message)

Thoughts?

jim618 commented 9 years ago

1. If you tell them that the upload was successful then they wil think: "Ok I will wait for the multibit guys to get back to me" (but we haven't heard anything)

Whereas if you tell them it failed then they can get on github / reddit / whatever. Also hearing that our error reporting system is broken (exogenously) is useful to us.

  1. Sounds good.
gary-rowe commented 9 years ago

Agreed. I'll add in the extra state.

gary-rowe commented 9 years ago

I've added the extra upload failure reporting. I've left the button disabled since a failure will likely take a while to fix relatively speaking and we don't want to encourage users to hang around waiting on us.

screen shot 2015-04-25 at 14 49 13

Ready for review and close.

jim618 commented 9 years ago

Tested that a failed upload gives feedback to the user correctly. Closing.

gary-rowe commented 9 years ago

Re-opening due to changes in the file format requiring commits.

gary-rowe commented 9 years ago

This is more of a maintenance update to the issue so can largely be covered by a code review. In particular verifying that error-report.json is generated and then encrypted using a debugger in ExceptionHandler to prevent the the plaintext getting deleted during the process.

Ready for review and close.

jim618 commented 9 years ago

Done a code review. Looks fine. We'll need end to end testing when the server component is done but the client side is looking good. Closing.