Compass results are inaccurate in some circumstances

jaustin commented 7 years ago

This is an issue to track some of the increased user-support requests we've had since the Dr Who Live Lesson raised the priority of the compass.

Issue as users experience it: The compass doesn't give a full range of values, and appears to be 'stuck' in one direction

Our current understanding: There are two confounding factors:

1) The micro:bit compass is strongly affected by the battery and other ferrous metals in the environment. Calibration should (must?) be performed in the same situation as the compass will be used. For example, people very rarely see issues when calibrating over USB and using on USB.

2) In certain circumstances the calibration algorithm chooses an incorrect (read: miles away) centre for the location of the calibration test points. This is most common when the battery is attached because the battery transforms the usual calibration point-cloud from 'points on the surface of a sphere' to 'points on the surface of a brioche bun shape' (Wikipedia tells me this is an oblate spheroid, but brioche bun will do). This increases the likelihood that the calibration algorithm will find an incorrect centre.

@finneyj has some good 3D plots to add. Here are some images from @DavidWhaleMEF that show the issue:

Calibrated on USB, used on USB: redo_cal_usb_use_usb

Calibrated on USB, used with battery: redo_cal_usb_use_bat

Calibrated on battery, used on battery: redo_cal_bat_use_bat

The current assumption is that we can remedy this with some adjustments to the calibration algorithm to detect a pathological calibration failure.
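A minimal sketch of what such a detection step might look like, assuming the failure shows up as a fitted centre that sits implausibly far from the point cloud. The function name and the `max_ratio` threshold are illustrative assumptions, not taken from microbit-dal:

```python
import math

def calibration_is_plausible(samples, centre, max_ratio=2.0):
    """Heuristic check: is `centre` a believable zero point for `samples`?

    The implied sphere radius (mean distance from the centre to the samples)
    is compared with the size of the point cloud. A radius many times larger
    than the cloud suggests the fit has placed the centre "miles away".
    `max_ratio` is a hypothetical tuning value.
    """
    # Size of the point cloud: length of its bounding-box diagonal.
    mins = [min(s[i] for s in samples) for i in range(3)]
    maxs = [max(s[i] for s in samples) for i in range(3)]
    extent = math.dist(mins, maxs)
    if extent == 0:
        return False  # all samples identical: nothing to calibrate against

    # Radius implied by the proposed centre.
    radius = sum(math.dist(s, centre) for s in samples) / len(samples)
    return radius <= max_ratio * extent

# Six synthetic samples on a sphere of radius 400 around (100, -50, 30).
samples = [
    (500, -50, 30), (-300, -50, 30),
    (100, 350, 30), (100, -450, 30),
    (100, -50, 430), (100, -50, -370),
]
print(calibration_is_plausible(samples, (100.0, -50.0, 30.0)))
print(calibration_is_plausible(samples, (100.0, -50.0, 50000.0)))
```

A check like this cannot say where the right centre is; it only flags a result that should trigger a retry or a fallback.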

jaustin commented 7 years ago

@markshannon @dpgeorge headsup

finneyj commented 7 years ago

@jaustin Thanks for raising here. Our foundation helpdesk tickets can be a bit hidden sometimes.

I spent a day or two analysing this and found a number of factors that contribute to this issue and generally impact on performance. Documenting results here...

1) The current least-mean-squares algorithm sometimes gets it really wrong.

To use a magnetometer as a compass, you essentially need to calibrate out a number of sources of error. The primary one is determining the "zero point" of the raw data readings. The current implementation gathers a set of data points then uses a least-mean-squares algorithm to generate an "ideal" zero point that places all those samples closest to the surface of a sphere. This is the algo cited in the application note for the MAG3110 magnetometer chip we're using.

Normally, this algo does a decent job of this, but sometimes (and this seems to depend on device and external conditions more than anything) it interprets the samples as a chord to the sphere, and places the zero point off in space somewhere. This results in a spectacularly bad calibration, and I suspect this is a cause of the very worst of the reports.
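For readers unfamiliar with this kind of fit, here is a minimal sketch of one common linearised least-squares sphere fit. It is an illustration under assumptions, not the DAL's code: the function names are made up, and it assumes clean, non-coplanar samples. It rewrites |p - c|^2 = r^2 as 2c.p + k = |p|^2 with k = r^2 - |c|^2, which is linear in (c, k):

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_sphere(samples):
    """Return (centre, radius) of the least-squares sphere through samples."""
    # One row per sample: [2x, 2y, 2z, 1] . [cx, cy, cz, k] = x^2 + y^2 + z^2
    rows = [(2.0 * x, 2.0 * y, 2.0 * z, 1.0) for x, y, z in samples]
    rhs = [x * x + y * y + z * z for x, y, z in samples]
    # Normal equations: (A^T A) u = A^T b
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(4)]
    cx, cy, cz, k = solve_linear(ata, atb)
    radius = math.sqrt(k + cx * cx + cy * cy + cz * cz)
    return (cx, cy, cz), radius

# Synthetic samples on a sphere of radius 450 centred at (120, -80, 40).
samples = [
    (570, -80, 40), (-330, -80, 40),
    (120, 370, 40), (120, -530, 40),
    (120, -80, 490), (120, -80, -410),
]
centre, radius = fit_sphere(samples)
print(centre, radius)
```

With noisy samples that cover only a small patch of the sphere, the same equations become poorly conditioned, which is consistent with the "chord" misinterpretation described above.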

For reference, the first 3D image below gives an idea what a clean calibration looks like (green dots are points used for calibration, blue dots are data points collected only for visualization/validation and the red spot is the determined zero point). The second image shows a calibration where the data is misinterpreted as a chord...

compass-normal

compass-batterypack

2) The effect of battery packs

A common use case for micro:bit is to blu-tack/tape the 2xAAA battery pack onto the back of the device. We've all done this. The issue here is that those batteries get taped directly onto the magnetometer, so have a huge effect on the sensitivity of the device. By my measurements, you get a big shift in zero point and compression across multiple axes to form a "Brioche", rather than a sphere. This results in the compressed readings David reported above.

To help visualize, the image below shows raw output from the same device and same software, both with and without a battery pack.

compass_both

3) The effect of battery packs vol.2

In addition to the effects above, small changes in the battery pack position and orientation have a very pronounced effect on the data. Like, moving the battery pack by a few millimetres renders the calibration pretty much useless because this affects the raw data by several times the scale of the earth's magnetic field. e.g. check out the raw data output below. This is under controlled conditions where the only thing moving is the battery pack, by less than 10 degrees of rotation. See the effect in the Y and Z axes (second and third number listed on the "DATA" line). In other words, the calibration is quite brittle, and can slip depending on usage - and kids aren't very careful.

I've worked up some changes to try to address these, with mixed success:

- Replaced the least-mean-squares algo with a mean average followed by an iterative hill descent algorithm to find an optimal position within the point cloud. This seems to work pretty well (and as a bonus has a lower RAM and FLASH footprint).
- Added a scaling factor in each of the axes, determined by "inflating the brioche" point cloud to fit a sphere. This looks reasonable in R simulations and seems to improve performance with the battery pack attached, and has no effect on a device without a battery pack, so I think this is a decent approach, but I don't have any hard results.
- Added a simple tracking algorithm to try to handle dynamic shifts in the zero point, by determining when data points are sampled somewhere "unexpected" (outside the sphere radius) and slowly walking the zero point iteratively toward such points. This works pretty badly at the moment - probably too naive an algorithm...
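The first change in the list above can be sketched roughly as follows. This is an illustration only: the scoring function (spread of the sample-to-centre distances), the step sizes and the function names are assumptions, and the branch itself may score candidates differently.

```python
import math

def spread(samples, centre):
    """Sum of squared deviations of sample distances from their mean.

    Zero when every sample is exactly the same distance from `centre`.
    """
    d = [math.dist(s, centre) for s in samples]
    mean = sum(d) / len(d)
    return sum((v - mean) ** 2 for v in d)

def hill_descent_centre(samples, step=64.0, min_step=0.01):
    """Start at the mean of the samples, then walk downhill on `spread`."""
    n = len(samples)
    centre = [sum(s[i] for s in samples) / n for i in range(3)]
    best = spread(samples, centre)
    while step > min_step:
        improved = False
        # Try a move of +/- step along each axis; keep any that lowers the score.
        for axis in range(3):
            for delta in (step, -step):
                trial = centre[:]
                trial[axis] += delta
                score = spread(samples, trial)
                if score < best:
                    centre, best, improved = trial, score, True
        if not improved:
            step /= 2.0  # nothing helped at this scale, so refine the search
    return tuple(centre)

# Synthetic samples on a sphere of radius 400 centred at (100, -50, 30).
# The two extra points bias the plain mean away from the true centre.
samples = [
    (500, -50, 30), (-300, -50, 30),
    (100, 350, 30), (100, -450, 30),
    (100, -50, 430), (100, -50, -370),
    (340, 270, 30), (100, 190, 350),
]
print(hill_descent_centre(samples))
```

Because the search starts inside the point cloud and only ever takes bounded steps, it cannot jump straight to a far-away centre the way a closed-form fit can, although it can still be misled by poor sample coverage.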

Overall, in my results, performance is improved with these changes (especially with battery pack) but is still brittle - the compass heading output is accurate for a while, then drifts off when something sneezes...
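The per-axis scaling idea from the list above might look something like this in its simplest form. It is a sketch under assumptions: it takes the zero point as already known, assumes the samples reach the extremes of every axis, and uses hypothetical function names.

```python
def axis_scales(samples, centre):
    """Per-axis factors that stretch the squashed axes to match the widest one."""
    # Half-extent of the point cloud along each axis, measured from the centre.
    half = [max(abs(s[i] - centre[i]) for s in samples) for i in range(3)]
    target = max(half)
    # Leave an axis unscaled if it has no extent at all (degenerate data).
    return tuple(target / h if h else 1.0 for h in half)

def apply_calibration(raw, centre, scales):
    """Remove the zero-point offset, then inflate each axis."""
    return tuple((raw[i] - centre[i]) * scales[i] for i in range(3))

# A "brioche": the Z axis is compressed to half the extent of X and Y.
centre = (10.0, 20.0, 30.0)
samples = [
    (410, 20, 30), (-390, 20, 30),
    (10, 420, 30), (10, -380, 30),
    (10, 20, 230), (10, 20, -170),
]
scales = axis_scales(samples, centre)
print(scales)
print(apply_calibration((10, 20, 230), centre, scales))
```

On a device without a battery pack the three half-extents are roughly equal, so the factors stay close to 1 and the correction has little effect, which matches the behaviour reported above.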

markshannon commented 7 years ago

Calibrating the compass under one set of conditions and then using under another is never going to work. We should document this properly rather than attempting to do the impossible.

@finneyj What do you mean by the "least-mean-squares" algorithm? The least mean square of the offset is the mean position.

markshannon commented 7 years ago

Could you please keep the code uncoupled from the DAL memory allocator, fibers, etc? It would be nice to use any solution you come up with for micropython, rather than having to rewrite it (again).

finneyj commented 7 years ago

@markshannon

Agree - calibrating under one condition and changing to something very different (e.g. calibrating on USB only then running with a battery pack and/or integrating said micro:bit into a giant iron death robot) is rather a fool's quest. I do wonder about the common case of using a micro:bit with the supplied battery pack though - this feels a bit more bounded, and something that kids and teachers would expect to work, so let's see if we can do something to make that use case more robust.

As far as I understand it, the current least-mean-squares algorithm used here is determining the "best" centre point of a sphere that minimizes the combined error of placing all the associated data points on the surface of that sphere. I think this is subtly different from the mean position of the samples. It would be (I think) the same as the mean position of all possible samples - but not necessarily the ones you have. e.g. if all your sample points are only in the northern hemisphere of the globe (say Lancaster, Oxford, Seattle, Moscow), the mean would be nowhere near the earth's core. Whereas, given accurate data, the least mean square algorithm would still locate the centre of the earth. In other words, it should be more tolerant of a biased set of data points. In the pathological case above however, it placed the earth's centre somewhere near the moon, as it thought the surface was bending the other way.
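The hemisphere argument can be checked numerically with a 2D analogue: points on only the upper arc of a circle. This is a sketch with made-up data and hypothetical function names; the circle fit is the same linearised least-squares idea reduced to two dimensions, solved here with Cramer's rule.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_circle_centre(points):
    """Least-squares circle centre from 2cx*x + 2cy*y + k = x^2 + y^2."""
    rows = [(2.0 * x, 2.0 * y, 1.0) for x, y in points]
    rhs = [x * x + y * y for x, y in points]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(3)]
    d = det3(ata)
    solution = []
    for k in range(3):
        m = [row[:] for row in ata]
        for i in range(3):
            m[i][k] = atb[i]  # Cramer's rule: swap column k for the right-hand side
        solution.append(det3(m) / d)
    return solution[0], solution[1]

def mean_point(points):
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n

# "Northern hemisphere only": the upper arc of a circle of radius 5 at (3, -2).
arc = [(3 + 5 * math.cos(math.radians(a)), -2 + 5 * math.sin(math.radians(a)))
       for a in range(20, 180, 20)]

print("mean of samples:  ", mean_point(arc))
print("least-squares fit:", fit_circle_centre(arc))
```

The mean lands well above the true centre because every sample is on the same side of it, whereas the fit recovers the centre from the curvature of the arc. That reliance on curvature is also why the fit can go badly wrong when the measured curvature is unreliable.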

p.s. yes, should be easy enough to integrate into micropython. The fiber scheduler has been optional for all drivers since v2.0 anyway (save one short function to determine if there is or isn't a scheduler running), and no use of heap allocation here I don't think.

DavidWhaleMEF commented 7 years ago

Just a note of practicality here. Calibrating the compass inside a bigger device is often very hard to do. It's very hard to pick up and tilt an iron-death-robot in a sphere. The only way to calibrate here is to remove the micro:bit and calibrate it, and then put it back in.

Sorry if that sounds tongue-in-cheek, it definitely isn't. When I did the stint on The One Show last year, it wasn't practical to run the calibration procedure on the device we attached the compass bit to, so we had to calibrate away from the device and then fit it. This is also the same on the IET Acceleration case study linked here: http://faraday.theiet.org/stem-activity-days/bbc-microbit/case-studies/index.cfm (the Abbie Hutty video is 1 minute long and well worth a look).

We may have to deal with this with messaging to explain that performance in these situations is degraded. But it's definitely a use case that is very common and we can't just rule it out completely.

finneyj commented 7 years ago

One could argue that any self-respecting iron-death-robot worth its weight should be agile enough to perform such a maneuver... but it is a relevant point. There are lots of possible use cases.

If I were building such a robot, I think I'd be looking to use a different algorithm that suited the use case. e.g. most likely this thing would be running on wheels/tracks in a horizontal plane, so I'd use the basicBearing() function to generate a 2D only, non-tilt compensated reading and write my own simpler calibration algorithm and then poke in the calibration data directly. Just spinning on the spot (the really old micro:bit algorithm) would work well here as it would allow the calibration to be performed in situ with the motors running - ideal conditions. We shouldn't do this in the general case though, as we know the more common use case is kids/teachers holding a micro:bit in their hands.
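A rough sketch of the spin-on-the-spot idea, for illustration only. It does not call any DAL API: the function names are hypothetical, the samples are synthetic, and the axis and sign convention for the heading is an assumption that would need checking against the real sensor orientation.

```python
import math

def calibrate_2d(samples):
    """Estimate the X/Y zero point from one full rotation on a flat surface.

    The midpoint between the minimum and maximum reading on each axis is
    taken as that axis's offset.
    """
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    return (max(xs) + min(xs)) / 2.0, (max(ys) + min(ys)) / 2.0

def heading_2d(x, y, offset):
    """Heading in degrees in [0, 360), with no tilt compensation."""
    angle = math.atan2(y - offset[1], x - offset[0])
    return math.degrees(angle) % 360.0

# Simulated spin: horizontal field strength 300, zero point at (50, -20).
spin = [(50 + 300 * math.cos(math.radians(a)),
         -20 + 300 * math.sin(math.radians(a)))
        for a in range(0, 360, 10)]

offset = calibrate_2d(spin)
print(offset)
print(heading_2d(50.0, 280.0, offset))
```

The min/max midpoint only works if the robot completes a full turn and stays level, which is a reasonable assumption for something on wheels but not for a hand-held device.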

The building blocks are there to enable this - not sure how much is surfaced above microbit-dal though.

p.s. Sadly I don't think I would get "iron-death-robot" through Lancaster's ethics committee...

DavidWhaleMEF commented 7 years ago

The following two app notes from NXP are relevant here, especially the latter one (as I think @finneyj has probably based all his work on the first one already)

AN4248 Implementing a Tilt-Compensated eCompass using Accelerometer and Magnetometer Sensors

AN4246 Calibrating an eCompass in the Presence of Hard- and Soft-Iron Interference

markshannon commented 7 years ago

The maths in those documents assume 1000s of points in the data and reasonably complete coverage of the spheroid. The current calibration "game" provides 12 points on a circle.

DavidWhaleMEF commented 7 years ago

For completeness, this is the program that I wrote and used to generate the circular profile diagrams in this ticket, in case anyone wants to replicate the testing on their micro:bit

https://github.com/whaleygeek/mb_compass_tester

jaustin commented 7 years ago

@finneyj I think you had some great progress on a mitigation to make the low-point-count calibration behave more appropriately (or at least detect way out of whack decisions). Any update on that? It would be nice to try to include any fix in the next release of the DAL that will go into PXT/MakeCode stable....

finneyj commented 7 years ago

@DavidWhaleMEF Thanks for sharing app notes - yes, these are indeed the docs I based the 2nd generation implementation on (the first being the early version that didn't support tilt calibration).

@jaustin I haven't had a chance to undertake any more investigation beyond that described above, but the first two bullet points described above are implemented on a micro:bit and seem to work "ok" for me under stable conditions (either on USB or with a battery pack in a very stable position). It may be worth updating to this if we are confident it is indeed an improvement.

I would like to see some more tests if we can. Perhaps I could work up a branch/hex file that @DavidWhaleMEF could share with his contacts reporting issues to see if this shows improvement?

DavidWhaleMEF commented 7 years ago

Yes, sure @finneyj. I have some tickets from people who have reported issues and who have already said they would love to help test anything new that we have.

finneyj commented 7 years ago

OK. I'll work up a HEX file they can try out. Any suggestions for a good test "app" that they may be familiar with, or that would make a good comparison with what they've seen already?

DavidWhaleMEF commented 7 years ago

Find the TARDIS...

http://www.bbc.co.uk/programmes/articles/3ydvd6mvhl89cHVJ7F2nmzf/doctor-who-and-the-micro-bit-live-lesson

DavidWhaleMEF commented 7 years ago

@finneyj If you could post a hex file to me that I can get our test customers to use, that would be great, thanks.

finneyj commented 7 years ago

Thanks for the nudge @DavidWhaleMEF. Have done some more testing and am now thinking we also have a minor issue in the tilt-compensation algo. I don't think it's wrong, but I equally think it isn't what we want either. :-)

I'll get you something tomorrow one way or another.

jaustin commented 7 years ago

I've taken Joe's branch (compass_calibration_improvements) and built 3 different hex files for testing. compass test.zip

Two of them use a new UX where the 'yet to be touched' points are dim and the cursor flashes.

finneyj commented 7 years ago

awesome - thanks for helping out here @jaustin - appreciated.

jaustin commented 7 years ago

Feedback from the first user that we've sent this to (they had 10 devices all not calibrating well):

Yes, algorithm A and B fixed my problem! However, C does not.

C was the control, the original algorithm.

@finneyj I know this isn't yet as good as you'd like but it does seem to represent an improvement. Can we roll this into the next DAL version please?

jaustin commented 7 years ago

Okay, based on the user feedback we've got now we have to make the following decision:

Either 1) Use the new algorithm with 'Tilt to fill a circle' and a 12-point circle. People 'get' this easily

OR 2) Use the new algorithm with 'Tilt to fill the screen' - this is a change in method but would give us 25 points of calibration data.

The method used in 'B' above (draw a 21-point circle) seemed to confuse people because it doesn't appear to be a circle to many, whereas the 12-point circle does.

@finneyj and @DavidWhaleMEF (1) is clearly lower impact in terms of documentation. What we don't have good data on is the accuracy difference between the 21 and 12-point calibrations - Joe do you have anything there?

I'm inclined to go with the 25-point "Tilt to fill the screen" at the moment. Anyone else?

DavidWhaleMEF commented 7 years ago

Our friend (year 6) at IET OpenHouse said he found the big circle one confusing.

I also found that pretty hard to do, as it was not that clear which dots had not been filled in as the contrast between set and not set was very hard to see. Also it didn't really look like a circle, it looked like a ball.

We should also think about what happens with visually impaired people (actually I'm not visually impaired and even I found it very difficult to see which dots had or had not been filled). I think having three LED states is too inconsistent to rely on as part of a new UX to be honest.

I really like 'tilt to fill the screen'. It also gives us lots of data points which is most likely going to improve accuracy of the calibration, which is the whole point of making this change.

Be careful with those faded LEDs though, I found that really confusing.

I think you need to think through what to do when the screen has no dots on it - perhaps always fill the centre dot at the start (as you'll probably have or will get that point anyway), then as you are tilting it fill any other dot that it detects. Then show the smiley face when you have achieved it, so there is a clear point in the documentation we can refer to as 'you have done it'.

Just a side note, once you have this working, it's pretty much the screen design for my 'paint-roller' game that I showed at CBBC Live and Digital, Hull, 2015 - and people really understood the purpose of the UX on that one - all we had to say was 'shake and tilt to fill the screen' and they were off (even the really tiny users loved it!)

jaustin commented 7 years ago

I'll work up a test for the 'whole screen' method as this has found wide favour with people when discussed, but hasn't been tested in person yet.

jaustin commented 7 years ago

I've built the 'whole screen' version and it's not very easy to get the top corners without changing the threshold to 700, which I've done, but @finneyj not sure if that'll reduce the quality of the overall calibration?

microbit-compass-D.hex.zip
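To make the role of the threshold concrete, here is a minimal sketch of how accelerometer tilt could be mapped onto the 5x5 display. Only the 700 value comes from this thread; the inner threshold, the names, and the pre-filled centre dot (suggested earlier in the thread) are assumptions for illustration and may not match the branch.

```python
OUTER_MG = 700  # outer threshold in milli-g, the value mentioned above
INNER_MG = 200  # hypothetical inner threshold, chosen only for illustration

def tilt_to_index(mg):
    """Map one accelerometer axis reading (milli-g) to a row/column 0..4."""
    if mg < -OUTER_MG:
        return 0
    if mg < -INNER_MG:
        return 1
    if mg <= INNER_MG:
        return 2
    if mg <= OUTER_MG:
        return 3
    return 4

class FillScreen:
    """Tracks which of the 25 LEDs have been visited during calibration."""

    def __init__(self):
        self.lit = {(2, 2)}  # start with the centre dot already filled

    def update(self, ax_mg, ay_mg):
        """Light the LED under the current tilt; return True when all are lit."""
        self.lit.add((tilt_to_index(ax_mg), tilt_to_index(ay_mg)))
        return len(self.lit) == 25

screen = FillScreen()
done = False
for ax in (-800, -400, 0, 400, 800):
    for ay in (-800, -400, 0, 400, 800):
        done = screen.update(ax, ay)
print(done)
```

A corner LED needs both axes beyond the outer threshold at the same time, which is why lowering that threshold makes the top corners easier to reach.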

jaustin commented 7 years ago

https://github.com/lancaster-university/microbit-dal/tree/compass_full_screen_calib here's the branch I've been working off if someone wants to reproduce

finneyj commented 7 years ago

Thanks all - just catching up here.

@jaustin Thanks for the dev and test and sharing the results of the trial. Great to hear the updated algorithm improved matters, and equally reassuring that the placebo resulted in a fail case - gives us confidence. In all my tests the new algorithm was always at least as good as the old one (normally about the same, but didn't exhibit those big outliers). Based on this, I agree we should look to merge this new algorithm as a bugfix/improvement.

So this leaves us the question of the UX and number of calibration points to gather... I don't have hard data on how the accuracy varies with the number of sample points, but intuitively (based on experiences when I was generating the analyses above), the spread of samples is more important than the absolute number of samples. So generally speaking, the whole screen method should perform better (as those 4 corner points will carry quite a bit of information). @jaustin Reducing the thresholds to 700mg is fine IMHO - these are fairly arbitrary thresholds simply there to encourage a good spread of samples. The additional sample points would more than make up for the minor compression around the axis-aligned points.

IMHO, this boils down to two UX questions:

1) Do users find the full screen method more intuitive than the circle method, and

2) If so, is the benefit to changing the UX slightly worth any confusion caused by changing a known UX?

Thoughts on a postcard please...

jaustin commented 7 years ago

We've done a bit of informal testing on (1) and (2) and the answers were certainly 'yes' and 'yes'

This is most significant when moving to a 'filled' circle: the empty circle worked fine but didn't have enough points, while the filled circle confused people.

The instructions for filling the screen are really just as easy, and with the added flashing dot code there, it works pretty intuitively. So far testing in the foundation was also very positive about how easy it was.

J


finneyj commented 7 years ago

Thanks @jaustin - sounds conclusive. Let's move to the new algo with the full screen UX.

jaustin commented 7 years ago

@finneyj given the recent conversations with @dpgeorge about making the calibration take function pointers for scrolling text and displaying an image, shall we do this all at once? If you haven't started I might have a stab or get Sam to have a look
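As a sketch of the idea only (the real change would be C++ function pointers in the DAL, and every name here is hypothetical), a calibration routine that receives its display operations as parameters might be shaped like this:

```python
def run_calibration(read_sample, show_message, show_image, needed=25):
    """Collect `needed` samples, using only the callbacks for any output.

    The routine knows nothing about the display, scheduler or allocator;
    the caller decides how text and images actually get shown.
    """
    show_message("TILT TO FILL SCREEN")  # illustrative prompt text
    samples = []
    while len(samples) < needed:
        samples.append(read_sample())
    show_image("happy")  # illustrative image name
    return samples

# Example wiring with stand-in callbacks that just record what happened.
log = []
fake_readings = iter((i, -i, 2 * i) for i in range(100))

samples = run_calibration(
    read_sample=lambda: next(fake_readings),
    show_message=lambda text: log.append(("message", text)),
    show_image=lambda name: log.append(("image", name)),
    needed=5,
)
print(len(samples), log)
```

Passing the output operations in this way is one route to the decoupling requested earlier in the thread, since a MicroPython port could supply its own callbacks.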

microbit-mark commented 6 years ago

Some more user testing has been done on https://github.com/lancaster-university/microbit-dal/issues/288#issuecomment-318371281 and confirms that files A and B fix the issue. See https://twitter.com/BitawsBrackley/status/992753605533986817 https://twitter.com/BitawsBrackley/status/994662665581678593 https://twitter.com/BitawsBrackley/status/994661273781628928

microbit-pauline commented 6 years ago

@finneyj Is this ticket complete?

finneyj commented 6 years ago

Time will tell... You could spend your life optimising this one. :)

I think we're complete enough to close this issue now though, and reopen again if users continue to report problems.

andreipavlevich commented 2 years ago

Hi, I'm facing the same issue when I try to use the compass while the micro:bit is attached to a Tiny:bit board - http://www.yahboom.net/study/Tiny:bit

I understand that the problem is EMI of the board but is there any workaround?
