UWMRO / Focuser

Code for automating focus of the MRO telescope
Apache License 2.0
1 stars 0 forks source link

Generate and fit random polynomial data #9

Open parejkoj opened 4 years ago

parejkoj commented 4 years ago

@andi-bergeson will generate 6 points in (-5000, 5000) x, pick parameters for a 2nd order polynomial f, compute f(x) for those points, add Gaussian random noise, and provide the x,y data in a file attached here.

@maria8ch will fit a 2nd order polynomial to the provided x,y data and state the best fitting parameters as a polynomial: f(x) = a + b*x + c*x^2.

@andi-bergeson will report on whether parameters are "close enough" to the original polynomial.

@maria8ch 's code can be a starting point for our future fitting code.

andi-bergeson commented 4 years ago

Heyo! Attached are two csv files: "stepperseven" has the seven points generated by my chosen mysterious 2nd order polynomial and shifted by Gaussian noise, and "steppernoise" has 50 points that follows the same process as above.

I thought I'd include both, mainly because i thought the seven points looks fairly difficult to fit, just from looking. it could be that the mean/standard deviation of a parabola should be calculated differently than I chose to, but the 50pt plot looked good, so. meh. i'll attach my notebook so you can take a looksie. maria, dont peek at my function, no cheating maaaaannnnnn

andi-bergeson commented 4 years ago

stepper.zip

andi-bergeson commented 4 years ago

stepnotebook.zip

parejkoj commented 4 years ago

Hmm... This brings up the useful point that I did not do a good job specifying how much noise there should be (I was thinking on the level of 5-10%, whereas this is more like 50% or more). On the other hand, lets see what @maria8ch can do with it!

andi-bergeson commented 4 years ago

here ya go. noise on the level of 10.2% :) lessnoisy.zip

parejkoj commented 4 years ago

@andi-bergeson : In your notebook, please overplot your "truth" data (without noise) at the end. Then you and @maria8ch should meet and discuss why we weren't able to get a good fit to it (here's a hint: the data you provided may not be the data you think it is).

As an example, look at how this astropy example generates noisy data and figure out why it's different from what you did.

https://docs.astropy.org/en/stable/modeling/

Once you've sorted out what's going on with the data, try the fitting procedure again. For next week, we'll spend some time turning the fitting notebook into an importable module and write tests for it.

parejkoj commented 4 years ago

Note: something in my above statement may be a lie. It's up to you to figure out what.

andi-bergeson commented 4 years ago

alright well........................ your messages were very cryptic, but I think I understand a little better what's going on.

i think using np.arange for the seven x values wasn't the move. i changed it to the same as the x values for the 50 ( np.linspace(-5000,-5000,7) )

i plotted the clean function (both 50pts and 7pts), and they lined up perfectly.

It could be that you want me to have np.random.normal(0.0, 0.3, len(x)), instead of (0.0, width, len(x)) where width = 0.3 * np.std(y)...... but if the standard deviation of y represents points ~ 34% away from the mean, i can multiply by 30% to get 10.2% error margins. so. blah.

ah. there could be an issue with setting the mean = 0.0. but i wasn't sure what to do there..... the mean would be pulled up by both sides of the parabola. when i did random.normal(np.mean(y), width, lenx), it looked horrible. out of everything i tried, it looked best with mean=0.0

here's my brain garbage! i'm not going to proofread! oh and heres my journal corresponding to the "newstepperseven" and "newstepperfifty" files! xoxoxoxoxoxox

goodboysgood.zip

maria8ch commented 4 years ago

John and I figured out what the trouble was! our fitting boys were not asking for the domain of our data, it was trying to scale things. Fun! but wrong. so if we just go like this:

t_init = models.Polynomial1D(2)
fitter = fitting.LinearLSQFitter()
t_fit50 = fitter(t_init,data50['x'],data50['y'])
t_true = models.Polynomial1D(2)
t_true.parameters = [75,4400,6]

plt.scatter(data50['x'],data50['y'])
plt.plot(data50['x'],t_fit50(data50['x']), label = 'fit')
plt.plot(data50['x'],t_true(data50['x']), label = 'true')
plt.grid()
plt.legend();

we get our fit to cooperate, as well as what @andi-bergeson did on the noisy data. it turns out not so bad at all! the graph shows nice things like this- image very cool!

parejkoj commented 4 years ago

@maria8ch : your new task is to turn your "fit and find the vertex" code into a documented function in your notebook, and use that new function to fit all of the different data above (with 7 and 50 points, and more and less noise) and write text about how you think the fitter is doing at finding the vertex. We'll make that into a module and write some test code using the above data next time.

parejkoj commented 4 years ago

@andi-bergeson : next time we meet, lets talk about what "standard deviation" means in this context. The standard deviation of a bunch of "exactly parabolic" data is not the thing that we want to use to generate errors.

parejkoj commented 4 years ago

I filed an astropy issue about the lack of documentation of the domain parameter: https://github.com/astropy/astropy/issues/9942

parejkoj commented 4 years ago

@andi-bergeson : after we talked, I realized that the fact that the distribution of errors on your previous data was not Gaussian (per our conversation), @maria8ch's fits won't be as good as we might expect. If you can regenerate the data like we talked about, @maria8ch can try to run the new fitting function on it and do a new comparison.

parejkoj commented 4 years ago

@andi-bergeson and @maria8ch : Maria and I wrote the first tests for the vertex finder. We'll want one more test using the more appropriate values for the telescope (FWHM between 2-7" in a focus range of [-5000,5000]), once you figure them out. I'll leave it to the two of you to add that test and make a plot of the results (you can put the plot in a notebook).

@maria8ch : please make a pull request of your ticket, and I'll review it after you've done the above. You can just attach a PNG/JPEG of the final plot to the PR.

parejkoj commented 4 years ago

To run the tests locally yourself, or to allow you to import the fitter in a notebook, run this command:

python setup.py develop
andi-bergeson commented 4 years ago

This should be my last bundle of files for generating random data.

First file is my notebook "makesomenoise.ipynb" - which might be useful in other contexts! By changing the function at the beginning (in mine it's a parabola), you can generate test data with gaussian (or uniform, or anything you want) noise. Could be useful if, for example, you've developed an algorithm which takes observational data like a variable star's light curve and fits a function to it. You can mess with the noise & binning of the star's known light curve function to generate data to test your algorithm, in order to determine how large the sigma of different types noise can be before screwing over your fitting algorithm.

The second file "data.csv" contains seven points shifted by Gaussian noise, in the correct interval.

The third file "true.csv" contains 200 pts mapping out the true function - to visually compare it to whatever line you fit to the messy points.

nb_pluspts.zip

ojf commented 4 years ago

Sweet, something to torment folks in 481. =)

maria8ch commented 4 years ago

Okay, so I finally got it. I put @andi-bergeson 's notebook and files into the appropriate branches in here (data went into data, makesomenoise.ipynb went into notebooks). I ran into this error when trying to do python setup.py develop: can't open file 'setup.py': [Errno 2] No such file or directory I think i'm supposed to put the python module name instead of setup.py? I think? Not sure so I'm asking.