CARSv2 / cars-v2

CARSv2 project repository - public
MIT License

Creation of functions to read non-WOD files #9

Open BecCowley opened 1 year ago

BecCowley commented 1 year ago

I need some clarity on the output from reading non-WOD files.

My current thought is to write a function for each file type, with every function returning the same structure (perhaps a dataframe). Any notebook could then call the function to get data ready for input into the mapping.
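As a rough illustration of "one reader per file type, same output structure", here is a minimal sketch using only the standard library. The function name `read_csv_profiles`, the common field names, and the sample column names are all assumptions made for this example, not anything from the project; a pandas DataFrame could stand in for the list of dicts.

```python
import csv
import io

# Hypothetical common record layout that every reader would return.
COMMON_FIELDS = ("cast", "depth", "temperature")


def read_csv_profiles(handle, column_map):
    """Read one CSV source and return its rows in the common layout.

    column_map maps each common field name to the column name used by
    this particular file type (an assumption made for illustration).
    """
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "cast": record[column_map["cast"]],
            "depth": float(record[column_map["depth"]]),
            "temperature": float(record[column_map["temperature"]]),
        })
    return rows


# Made-up sample standing in for one non-WOD file type.
sample = io.StringIO(
    "station,pressure_depth_m,temp_degC\n"
    "A1,0,25.1\n"
    "A1,10,24.8\n"
    "B2,0,22.0\n"
)
profiles = read_csv_profiles(
    sample,
    {"cast": "station", "depth": "pressure_depth_m", "temperature": "temp_degC"},
)
```

A second file type would get its own reader (or its own `column_map`) but return the same fields, so downstream notebooks would not need to know where the data came from.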

Question: are we creating our tools as notebooks as a standard, or ultimately producing *.py functions/definitions/tools?

Some notes:

Thomas-Moore-Creative commented 1 year ago

> Question: are we creating our tools as notebooks as a standard, or ultimately producing *.py functions/definitions/tools?

@BecCowley - I'm not clear on what @ChrisC28 thinks here but in general I use notebooks for all my code development. As a final product notebooks are useful for code workflow documentation, examples, and sharing results.

For any often repeated tasks I usually take a simple approach of defining python functions - first right in the notebooks themselves and then collecting them in a local tools.py file (or files) that I can import as needed.

I'm not an expert on making proper python packages but I'd suggest that as we all build discrete functions the easiest way is to place mature functions into a shared tools.py file - or whatever is equivalent for julia?

BecCowley commented 1 year ago

@Thomas-Moore-Creative, thanks for this information. I'm happy to make notebooks, I haven't done it much before and hence my question about how they function together. I will follow your advice! I don't know how tools.py files work, I'll see what I can find out, but would be happy to hear you explain it to me.

Here are some notes from @ChrisC28 via email:

Here's my current (basic) workflow with the WOD data in ragged array format:

1. For each platform type (CTD, PFL, XBT, ...) and for a given variable (say, temperature), use a bit of magic to tag every obs value with a profile index.
2. Filter out the bad profiles (note: I haven't done this yet; it's on the to-do list).
3. For each profile, use the fancy TEOS10 vertical interpolation to put the profile on a set of standard levels (at the moment, just every 10 m or something).
4. Store the interpolated profile in a netCDF file with the dimensions (cast, depth).

The above is a sketch, but you get the idea.
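The tagging and gridding steps in that workflow can be sketched roughly as below. This is only an illustration under stated assumptions: the ragged layout is taken to be one flat array per variable plus a per-cast count (called `row_sizes` here), plain linear interpolation stands in for the TEOS-10 interpolation mentioned above, and all names and values are made up.

```python
from bisect import bisect_left


def profile_index(row_sizes):
    """Tag every observation with the index of the profile it belongs to."""
    index = []
    for cast, size in enumerate(row_sizes):
        index.extend([cast] * size)
    return index


def interp_to_levels(depths, values, levels):
    """Linearly interpolate one profile onto standard levels.

    Depths must be strictly increasing. Levels outside the sampled
    depth range are returned as None (no extrapolation).
    """
    out = []
    for level in levels:
        if level < depths[0] or level > depths[-1]:
            out.append(None)
            continue
        i = bisect_left(depths, level)
        if depths[i] == level:
            out.append(values[i])
        else:
            frac = (level - depths[i - 1]) / (depths[i] - depths[i - 1])
            out.append(values[i - 1] + frac * (values[i] - values[i - 1]))
    return out


# Two made-up casts stored back to back (ragged layout).
depth_flat = [0.0, 20.0, 5.0, 15.0, 25.0]
temp_flat = [20.0, 18.0, 10.0, 9.0, 8.0]
row_sizes = [2, 3]
levels = [0.0, 10.0, 20.0]

tags = profile_index(row_sizes)

# Build the (cast, depth) grid one profile at a time.
gridded = []
for cast in range(len(row_sizes)):
    d = [depth_flat[k] for k, t in enumerate(tags) if t == cast]
    v = [temp_flat[k] for k, t in enumerate(tags) if t == cast]
    gridded.append(interp_to_levels(d, v, levels))
```

Each row of `gridded` is one cast on the standard levels, which is the (cast, depth) shape described in the last step; the filtering of bad profiles is left out here.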

So, the dilemma I have is whether we put the non-WOD data in WOD ragged array format, or save the profiles directly (i.e. with dimensions (cast, depth)). I'm leaning towards the WOD format: it's what the data assimilation people tend to use, and it allows me to run the data through the exact same processing routines that we use for the WOD data.

As such, here's what I propose: a reader for each file type that takes (for example) the AIMS data, processes it following your magic (adds QC flags where required, etc.) and spits it out in WOD ragged array format.
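The "spit it out in ragged array format" step might look something like this sketch. The intermediate per-cast layout, the variable names (`depth`, `temperature`, `qc_flag`, `row_size`) and the flag values are placeholders for illustration; the real WOD variable names and flag conventions would need to be taken from the WOD files themselves.

```python
def to_ragged(profiles):
    """Flatten per-cast profiles into ragged-array style variables.

    profiles is a list of dicts holding 'depth', 'temperature' and
    'qc_flag' lists of equal length (a hypothetical intermediate layout).
    """
    ragged = {"depth": [], "temperature": [], "qc_flag": [], "row_size": []}
    for profile in profiles:
        n = len(profile["depth"])
        if len(profile["temperature"]) != n or len(profile["qc_flag"]) != n:
            raise ValueError("depth, temperature and qc_flag must have equal lengths")
        ragged["depth"].extend(profile["depth"])
        ragged["temperature"].extend(profile["temperature"])
        ragged["qc_flag"].extend(profile["qc_flag"])
        # One count per cast says how many observations belong to it.
        ragged["row_size"].append(n)
    return ragged


# Made-up casts; the flag values 0 and 1 are placeholders only.
casts = [
    {"depth": [0.0, 10.0], "temperature": [25.1, 24.8], "qc_flag": [0, 0]},
    {"depth": [0.0, 10.0, 20.0], "temperature": [22.0, 21.5, 21.0], "qc_flag": [0, 1, 0]},
]
ragged = to_ragged(casts)
```

Writing `ragged` out to a netCDF file is not shown; the point is only that each reader ends with the same flat arrays plus per-cast counts, so the existing WOD routines could be reused.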

Thomas-Moore-Creative commented 1 year ago

> @Thomas-Moore-Creative, thanks for this information. I'm happy to make notebooks, I haven't done it much before and hence my question about how they function together. I will follow your advice! I don't know how tools.py files work, I'll see what I can find out, but would be happy to hear you explain it to me.

Again, I'll note that my approach might not be best practice, but I'd suggest you start by testing functions in notebooks. Once you are confident about a function, you can put it into a my_functions.py file for general import into any notebook or Python code. The explanation below might help more:

You can define a local function in your Python Jupyter notebook by simply defining the function in a code cell. The function will then be available for use in subsequent cells. To call the function, simply include its name followed by parentheses and any required arguments. For example:

def my_function(arg1, arg2):
    # do something with the arguments, e.g. add them together
    result = arg1 + arg2
    return result

my_function(1, 2)

Note that local functions are only available within the same notebook where they are defined.

To import a function from a local file, use a from ... import statement: from, then the name of the file (without the .py extension), then import and the name of the function. The file needs to be in the same folder as the notebook or elsewhere on Python's import path. For example, if you have a file my_functions.py that contains a function my_function, you can import it using:

from my_functions import my_function

Then you can call the function using my_function() in your code.

BecCowley commented 1 year ago

@Thomas-Moore-Creative, thanks. Fairly straightforward, then!

BecCowley commented 1 year ago

List of issues that @BecCowley ran into while converting the AIMS CTD files from CSV to netCDF format:

BecCowley commented 1 year ago

@ChrisC28 I have converted the MNF CTD data from the CARS region. The files are in /oa-decadal-climate/work/observations/CARSv2_ancillary/MNF/NC

Notes:

Will push my code to the repository with the other converters. They are now in the src/features folder, not in the notebooks.

ChrisC28 commented 1 year ago

@BecCowley Awesome! Thanks for that. I'll try to get them into "the system" this week or next.

Paul Sandry is interested in having those data available for the ROAM data assimilation system.