usda-ars-ussl / fluxpart

Python module for partitioning eddy covariance flux measurements.

Opening non-pickled results #6

Closed mmaelicke closed 4 years ago

mmaelicke commented 4 years ago

Hey there,

First of all, thanks for this package! I should make clear that I am not at all an Eddy guy, more like the Python guy trying to write some code for Eddy folks. I had a look at the __init__ of FluxpartResult and tried to implement an alternative that reads from CSV instead of pickled Python objects (pickle is not a good option for me, since I change hardware within the workflow).

Now I am realizing that, as you write in the docs, FluxpartResult.df is multi-indexed, and I am not able to reverse-engineer it. I have a sample file that I want/need to open, which I believe is a result of flux partitioning. Maybe I am wrong.

Now my question: do you have any integrity checks that I could run against such a file, or can I somehow map the column names back onto your top-level keys like fluxes and so on? Then I could use your class...
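
For context, here is roughly the round trip I was hoping for. This is just a sketch: I am assuming the CSV was written with a two-level column header and a datetime index in the first column, which are guesses about the file layout.

import pandas as pd

# Assumed layout: two-level column header, datetime index in column 0.
df = pd.read_csv("results.csv", header=[0, 1], index_col=0, parse_dates=True)

# "fluxes" is one of the top-level keys I mean above.
print(df["fluxes"].head())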

Here is what I have done, quick and dirty:

import pandas as pd

class FluxpartResult(object):
    def __init__(self, fp_results, **kwargs):
        # Allow a file path in addition to a list of result objects
        if isinstance(fp_results, str):
            if fp_results.endswith('.pkl'):
                # Pickled dataframe written by fluxpart
                self.df = pd.read_pickle(fp_results)
                # with open(fp_results, "rb") as f:
                #     self.meta = pickle.load(f)
                return
            else:
                # Whitespace-delimited text file
                self.df = pd.read_csv(fp_results, sep=r'\s+')
                return
        # ... original constructor continues from here
        index = pd.DatetimeIndex(r.label for r in fp_results)

This works, but I obviously run into the indexing problem once I use the object. In the end, I would implement FluxpartResult.verify or something like that to make sure it is a valid file. That is what I ultimately need.
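
Roughly the kind of check I have in mind (just a sketch; apart from fluxes, I am guessing at the expected group names, so treat them as placeholders):

def verify(self):
    # Sketch: require a two-level column index whose top level
    # contains the expected groups ("fluxes" from the docs; the rest TBD).
    expected = {"fluxes"}
    if self.df.columns.nlevels != 2:
        raise ValueError("expected a two-level column MultiIndex")
    top = set(self.df.columns.get_level_values(0))
    missing = expected - top
    if missing:
        raise ValueError("missing column groups: {}".format(missing))
    return True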

Is something like this helpful and worth adding, or would you suggest that I build such a check outside of fluxpart? It would be great to hear your thoughts. Best,

Mirko

thskaggs commented 4 years ago

Hi Mirko,

Thanks for the question/comment. I don't recognize your sample file. It's not related to fluxpart.

I have been intending to change the way the partitioning results are saved and reopened. It should be possible to save and open using formats other than pickle. This is a busy week for me, so it will be a few days until I can get to this.

Todd

thskaggs commented 4 years ago

Update to version 0.2.10:

conda update -c ussl fluxpart

Partitioning results can now be saved and reloaded using the CSV format for the dataframe.

Save:

from fluxpart import fvs_partition
fvsp = fvs_partition("high_freq_datafile", etc ...)
fvsp.save_csv("results.csv")

Reload results:

from fluxpart import fpread
fvsp = fpread("results.csv")
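
The reloaded object exposes the dataframe as fvsp.df, so the multi-indexed columns can be inspected directly. A quick sketch, assuming the fluxes group is present in your results:

# Top-level column groups of the reloaded dataframe
print(fvsp.df.columns.get_level_values(0).unique())

# e.g. the "fluxes" group
print(fvsp.df["fluxes"].head())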

mmaelicke commented 4 years ago

Great! Thanks for the quick implementation.