SuperDARNCanada / fitacf.3.0

The repo for the new and improved fitacf routine
GNU General Public License v3.0

fitacf.3.0

THIS PROJECT HAS BEEN MOVED INTO RST (https://github.com/SuperDARN/rst). THIS REPO IS NOW DEPRECATED!!!

Building the project

Requires a version of RST to be set up with the correct environment variables. After running make in the project directory, a version of make_fit will be placed in a bin directory within the project folder.

Details

fitacf.3.0 is a complete rewrite of the ACF fitting routine used in the RST software package. fitacf.3.0 attempts to improve on many aspects of the current version, fitacf.2.7, both in terms of algorithm correctness and software design.

Algorithm description

TODO

Software design

fitacf.3.0 was designed to be easy to read, easy to test, and easy to modify. In contrast to the older versions, variable names, function names, and file names are more self-descriptive, so it is easier to locate things and understand the code. Data is now better encapsulated, so it is clear what is being operated on at all times. Because data follows a better encapsulation scheme, functions can be designed with greatly reduced coupling compared to the older versions.

The data structure used to contain data during the fitting procedure is a nested linked list. A linked list of range nodes is used to hold the data associated with each range for a particular scan. Range nodes hold within themselves quite a bit of information:

Power nodes contain a value for log power, error(sigma), and time. Phase and elevation nodes contain a value for phi, error(sigma), and time.
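As a rough illustration of the nesting described above, here is a minimal sketch in C. All type and field names here are hypothetical and chosen for readability; the real definitions live in fit_structures.h and will differ.

```c
#include <stddef.h>

/* Hypothetical names throughout; see fit_structures.h for the real ones. */

/* One power sample: log power, its error (sigma), and time. */
typedef struct PowerSample {
    double log_pwr;
    double sigma;
    double t;
    struct PowerSample *next;
} PowerSample;

/* One phase or elevation sample: phi, its error (sigma), and time. */
typedef struct PhaseSample {
    double phi;
    double sigma;
    double t;
    struct PhaseSample *next;
} PhaseSample;

/* One range: owns its own lists of samples (lists inside a list). */
typedef struct RangeEntry {
    int range;
    PowerSample *pwrs;
    PhaseSample *phases;
    PhaseSample *elevs;   /* elevation samples share the phase layout */
    struct RangeEntry *next;
} RangeEntry;

/* Walk the outer list of ranges and each inner list of power samples. */
size_t count_power_samples(const RangeEntry *ranges)
{
    size_t n = 0;
    for (const RangeEntry *r = ranges; r != NULL; r = r->next)
        for (const PowerSample *p = r->pwrs; p != NULL; p = p->next)
            n++;
    return n;
}
```

The point of the sketch is only the shape: every range carries its own sample lists, so a stage can work on one range without touching the others.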

Linked lists work well in this application. Although their implementation is more complicated than arrays, it is much easier to add or remove elements without having to malloc new sections of contiguous memory and copy values over, or having to use extra arrays to keep track of good data. When the data structure is run through the filtering stages, bad data can be trimmed from it completely, so all data left at the fitting and determination stages is known to be good. Data is also almost always worked on sequentially, which lists are good at.

One thing that may look confusing at first to those unfamiliar with it is the use of function pointers when applying operations to list nodes. The list is generalized, so function pointers are used as callbacks. Callbacks are needed to know how to delete or iterate over a generalized list, for example. The list library has an llist_for_each method used to apply an operation to each node in the list. This method again uses a function pointer to know what function to apply to each node, similar to how the map function works in Python. The foreach method is often nested in fitacf.3.0 because we are using lists of lists. All fitting structures are defined in fit_structures.h.
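The callback pattern can be sketched with a small stand-alone generic list. This is not the list library used by fitacf.3.0: the names SkList, sk_push, sk_for_each, and sk_clear are made up for this example, and the real llist_for_each signature may differ.

```c
#include <stdlib.h>

/* Hypothetical generic list; items are untyped pointers. */
typedef struct SkNode {
    void *item;
    struct SkNode *next;
} SkNode;

typedef struct {
    SkNode *head;
    SkNode *tail;
} SkList;

/* Callback type: receives the stored item plus a caller-supplied context. */
typedef void (*sk_visit)(void *item, void *ctx);

/* Append an item. Returns 0 on success, -1 if allocation fails. */
int sk_push(SkList *list, void *item)
{
    SkNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->item = item;
    node->next = NULL;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
    return 0;
}

/* Apply fn to every item, in order. The list knows nothing about item types. */
void sk_for_each(const SkList *list, sk_visit fn, void *ctx)
{
    for (const SkNode *n = list->head; n != NULL; n = n->next)
        fn(n->item, ctx);
}

/* Free all nodes; free_item (may be NULL) says how to delete each item. */
void sk_clear(SkList *list, void (*free_item)(void *))
{
    SkNode *n = list->head;
    while (n != NULL) {
        SkNode *next = n->next;
        if (free_item != NULL)
            free_item(n->item);
        free(n);
        n = next;
    }
    list->head = list->tail = NULL;
}

/* Inner callback: each item is a double, ctx is a running total. */
void add_value(void *item, void *ctx)
{
    *(double *)ctx += *(double *)item;
}

/* Outer callback: each item is itself a list, so the for-each is nested. */
void sum_inner(void *item, void *ctx)
{
    sk_for_each((const SkList *)item, add_value, ctx);
}

double sk_nested_sum(const SkList *outer)
{
    double total = 0.0;
    sk_for_each(outer, sum_inner, &total);
    return total;
}
```

Because the list only stores void pointers, the callbacks are the only place that knows the item type, which is why both iteration and deletion take a function pointer.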

Using a structure like this means that functions no longer have to be coupled the way they were in older versions of fitacf. Instead of rippling changes from one function to the next, functions return to the top of the stack when they are finished operating on the data structure. This means that we can independently test, modify, or even disable pieces of the program without affecting the operation of further stages. As an example, filtering of transmitter pulse overlapped lags can be completely disabled if a researcher wanted to test something using simulated data. Making changes like this would be extremely difficult in the older versions.

Now that the data structure is explained, we can go into more detail about how this structure is operated on. The program goes through three main stages:

  1. Preprocessing
  2. Fitting
  3. Echo parameter determination

In the preprocessing stage, raw data is read in and used to fill the lists and fields of the data structure. The data then goes through a filtering process in which low power lags, transmitter pulse overlapped lags, and noise ranges are removed.
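To show why trimming a list is cheap, here is a minimal sketch of an in-place filter. The node layout, the function names, and the simple power threshold are assumptions made for this example; the actual filtering criteria in fitacf.3.0 are not described here.

```c
#include <stdlib.h>

/* Hypothetical lag node, reduced to what the example needs. */
typedef struct LagNode {
    int lag;
    double pwr;
    struct LagNode *next;
} LagNode;

/* Predicate callback: return nonzero if the node should be dropped. */
typedef int (*lag_pred)(const LagNode *node, const void *ctx);

/* Prepend a node. Returns the new head, or the old head if malloc fails. */
LagNode *push_lag(LagNode *head, int lag, double pwr)
{
    LagNode *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->lag = lag;
    node->pwr = pwr;
    node->next = head;
    return node;
}

/* Unlink and free every node the predicate rejects. No copying, no
 * "good data" bookkeeping array: whatever remains is good data.
 * Returns the number of nodes removed. */
size_t remove_lags_if(LagNode **head, lag_pred should_drop, const void *ctx)
{
    size_t removed = 0;
    LagNode **link = head;          /* pointer to the link we may rewrite */
    while (*link != NULL) {
        LagNode *cur = *link;
        if (should_drop(cur, ctx)) {
            *link = cur->next;      /* bypass the node */
            free(cur);
            removed++;
        } else {
            link = &cur->next;
        }
    }
    return removed;
}

/* Example predicate (assumed criterion): power below a threshold in ctx. */
int below_threshold(const LagNode *node, const void *ctx)
{
    return node->pwr < *(const double *)ctx;
}
```

Swapping in a different predicate, or skipping the call entirely, changes or disables one filter without touching any later stage.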

In the fitting stage, the power fitting is done first. After a power fit is complete, that data can be used to calculate sigma for the phase nodes and a phase unwrap can occur. Once the phase is unwrapped, a fit can be applied. The process of fitting for elevation is the same as fitting for phase.
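For readers unfamiliar with phase unwrapping, the following is a generic neighbour-to-neighbour unwrap. It is only a sketch of the general idea and is not necessarily the method fitacf.3.0 uses; the function name and the array-based interface are assumptions.

```c
#include <stddef.h>

#define SK_PI 3.14159265358979323846

/* Generic unwrap: shift each sample by a multiple of 2*pi so that the step
 * from the previous (already unwrapped) sample lies within [-pi, pi].
 * Assumes the true phase changes by less than pi between samples. */
void unwrap_phase(double *phi, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        double step = phi[i] - phi[i - 1];
        while (step > SK_PI)
            step -= 2.0 * SK_PI;
        while (step < -SK_PI)
            step += 2.0 * SK_PI;
        phi[i] = phi[i - 1] + step;
    }
}
```

Measured phase is only known modulo 2*pi, so a straight line cannot be fitted to it until the jumps are removed. If the true phase advances by more than pi between neighbouring samples, this simple scheme picks the wrong branch.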

The fitting routine uses the exact algorithms for 1 or 2 parameter straight line least squares fitting as described in Numerical Recipes in C: The Art of Scientific Computing, and the fitting structure uses the same naming convention as this book. If you want to look at the least squares fitting in more depth, or if you would like to use any of the additional information that is calculated but not used (coefficient of correlation, chi squared value, etc.), then refer to this book.
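As a reference for what a weighted straight line fit computes, here is a minimal sketch using the textbook normal-equation sums. It is written for this README and is not the code in fitacf.3.0: the struct and function names are hypothetical, variances are returned instead of sigmas, and every sig[i] is assumed to be greater than zero.

```c
#include <stddef.h>

/* Hypothetical result struct for the model y = a + b*x. */
typedef struct {
    double a;      /* intercept */
    double b;      /* slope */
    double var_a;  /* variance of a (sigma_a squared) */
    double var_b;  /* variance of b (sigma_b squared) */
    double chi2;   /* chi squared of the fit */
} LineFit;

/* Sum of squared, sigma-normalized residuals for y = a + b*x. */
double line_chi2(const double *x, const double *y, const double *sig,
                 size_t n, double a, double b)
{
    double chi2 = 0.0;
    for (size_t i = 0; i < n; i++) {
        double r = (y[i] - a - b * x[i]) / sig[i];
        chi2 += r * r;
    }
    return chi2;
}

/* Two-parameter fit. Returns 0 on success, -1 if the fit is degenerate. */
int fit_two_param(const double *x, const double *y, const double *sig,
                  size_t n, LineFit *out)
{
    double S = 0.0, Sx = 0.0, Sy = 0.0, Sxx = 0.0, Sxy = 0.0;
    if (n < 2)
        return -1;
    for (size_t i = 0; i < n; i++) {
        double w = 1.0 / (sig[i] * sig[i]);   /* weight = 1 / sigma^2 */
        S += w;
        Sx += w * x[i];
        Sy += w * y[i];
        Sxx += w * x[i] * x[i];
        Sxy += w * x[i] * y[i];
    }
    double delta = S * Sxx - Sx * Sx;
    if (delta <= 0.0)                         /* all x identical */
        return -1;
    out->a = (Sxx * Sy - Sx * Sxy) / delta;
    out->b = (S * Sxy - Sx * Sy) / delta;
    out->var_a = Sxx / delta;
    out->var_b = S / delta;
    out->chi2 = line_chi2(x, y, sig, n, out->a, out->b);
    return 0;
}

/* One-parameter fit through the origin, y = b*x (a is fixed at 0). */
int fit_one_param(const double *x, const double *y, const double *sig,
                  size_t n, LineFit *out)
{
    double Sxx = 0.0, Sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double w = 1.0 / (sig[i] * sig[i]);
        Sxx += w * x[i] * x[i];
        Sxy += w * x[i] * y[i];
    }
    if (Sxx <= 0.0)                           /* no data, or all x are zero */
        return -1;
    out->a = 0.0;
    out->var_a = 0.0;
    out->b = Sxy / Sxx;
    out->var_b = 1.0 / Sxx;
    out->chi2 = line_chi2(x, y, sig, n, 0.0, out->b);
    return 0;
}
```

Taking the square root of var_a and var_b gives the parameter errors. The book computes the same quantities in a rearranged form that is less sensitive to roundoff.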

In the echo parameter determination stage, the fitted values are then used to determine values for things such as velocity, elevation, power, and spectral width with their respective errors. These parameters are then written out to a file.
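As one example of turning a fitted value into an echo parameter, the sketch below converts a fitted phase slope into a line-of-sight velocity using the standard two-way Doppler relation v = c * w / (4 * pi * f). The function name, the units of its arguments, and the sign convention are assumptions for this example; the actual conversion code in fitacf.3.0 may scale or sign things differently.

```c
#define SK_C  299792458.0              /* speed of light, m/s */
#define SK_PI 3.14159265358979323846

/* Hypothetical value-with-error pair. */
typedef struct {
    double value;
    double sigma;
} Estimate;

/* slope and slope_sigma: fitted phase slope and its error, in rad/s.
 * tx_freq_hz: transmit frequency in Hz.
 * Returns velocity and its error in m/s (linear error propagation). */
Estimate velocity_from_phase_slope(double slope, double slope_sigma,
                                   double tx_freq_hz)
{
    double scale = SK_C / (4.0 * SK_PI * tx_freq_hz);
    Estimate v = { scale * slope, scale * slope_sigma };
    return v;
}
```

Because the conversion is a constant scale factor, the error on the velocity is just the error on the slope multiplied by the same factor.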

Testing

There are no designed unit tests for fitacf.3.0, but there are functions that can be used to log almost every data structure to a file so that you can follow along with what is happening in more detail. To use them, call the print functions wherever you want, or pass one as a callback to llist_for_each to log the details of each node in a list. Samples of this are commented out in the top level. The testing code supplied was used in an earlier version of development until Pasha took over testing using his own tools. I will bring this code up to the current version of development as it's still useful for testing individual pieces of fitacf.3.0.
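The logging-by-callback idea looks roughly like the sketch below. The node type, the for-each helper, and the print function are all hypothetical stand-ins; the real print functions and the llist_for_each signature in the project may differ.

```c
#include <stdio.h>

/* Hypothetical phase node, reduced to the logged fields. */
typedef struct PhaseRec {
    double phi;
    double sigma;
    double t;
    struct PhaseRec *next;
} PhaseRec;

/* Context handed to the callback: where to write, and a line counter. */
typedef struct {
    FILE *out;
    int lines;
} LogCtx;

typedef void (*phase_visit)(const PhaseRec *node, void *ctx);

/* Stand-in for a list for-each: call fn on every node. */
void for_each_phase(const PhaseRec *head, phase_visit fn, void *ctx)
{
    for (const PhaseRec *n = head; n != NULL; n = n->next)
        fn(n, ctx);
}

/* Print callback: writes one line per node to the stream in ctx. */
void log_phase(const PhaseRec *node, void *ctx)
{
    LogCtx *log = (LogCtx *)ctx;
    fprintf(log->out, "phi=%f sigma=%f t=%f\n",
            node->phi, node->sigma, node->t);
    log->lines++;
}
```

Calling for_each_phase(list, log_phase, &ctx) between two stages, with ctx.out pointing at an open file, dumps the list as it stands at that point, which makes it possible to compare the data before and after a stage.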

You can also change the -g option in the makefile to -O3 for a large speed increase. The -g option only adds debugging information so that the code can be stepped through in a debugger.