MHKiT-Software / MHKiT-Python

MHKiT-Python provides the marine renewable energy (MRE) community tools for data processing, visualization, quality control, resource assessment, and device performance.
https://mhkit-software.github.io/MHKiT/
BSD 3-Clause "New" or "Revised" License

Add pd.DataFrame as allowed input to functions #34

Closed rpauly18 closed 1 year ago

rpauly18 commented 4 years ago

pd.DataFrames should be allowed as input to functions because many functions return DataFrames.

The first example of this issue is capture_length, which only accepts np.ndarray or pd.Series, while the energy_flux function returns a pd.DataFrame. The output of energy_flux is the input for capture_length. Other functions should be checked for the same issue.

Functions to be changed: capture_length, capture_length_matrix, wave_energy_flux_matrix
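A minimal sketch of the mismatch described above. This is not the MHKiT source: `capture_length_strict` is a hypothetical stand-in that applies the same kind of type check, and the column name "J" on the flux DataFrame is an assumption made for illustration.

```python
import numpy as np
import pandas as pd


def capture_length_strict(P, J):
    """Hypothetical stand-in: capture length L = P / J.

    Accepts only np.ndarray or pd.Series, mirroring the restriction
    reported in this issue. Not the actual MHKiT implementation.
    """
    if not isinstance(P, (np.ndarray, pd.Series)):
        raise TypeError("P must be of type np.ndarray or pd.Series")
    if not isinstance(J, (np.ndarray, pd.Series)):
        raise TypeError("J must be of type np.ndarray or pd.Series")
    # Divide by position so mismatched indexes cannot introduce NaN.
    return pd.Series(np.asarray(P) / np.asarray(J), name="L")


P = pd.Series([10.0, 20.0, 30.0])
# Assumed shape of an energy flux result: a single-column DataFrame.
J = pd.DataFrame({"J": [2.0, 4.0, 5.0]})

try:
    capture_length_strict(P, J)  # a DataFrame is rejected by the type check
    rejected = False
except TypeError:
    rejected = True

# Workaround the user currently needs: pull the column out by hand.
L = capture_length_strict(P, J["J"])
```

The manual `J["J"]` step is the conversion this issue is asking to remove.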

ssolson commented 4 years ago

Hey Rebecca. As a recap, I had DataFrames as inputs to functions in the River module. This implementation was rejected due to the need to have reserved column names.

If I understand correctly, what you are proposing would have the same issue: the user would need to specify which column names to use in the DataFrame (in a dictionary or list), or the function would expect a reserved column name whenever a DataFrame is passed. If so, I would not be in favor of this implementation.

I think it is okay to return a DataFrame with a column name specified by MHKiT because this can help with consistency in naming for MHKiT computed quantities. However, if you would prefer to return a Series, that could be okay with me as well. Let me know what you think and whether I understand your proposal correctly.

rpauly18 commented 4 years ago

I think that would only be an issue if column names are being used for something in the function. If you are just applying some math to the DataFrame, it should not matter. Additionally, if you are passing two DataFrames and need to apply some math between them, we can change the column names to match within the function. We are implementing this in ac_power_three_phase and dc_power.
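A rough sketch of the "change the column names to match within the function" idea, assuming two single-column DataFrames. The helper name `divide_frames` and the column names are hypothetical, and this is not how ac_power_three_phase or dc_power are necessarily written. It also shows why the rename matters: pandas aligns DataFrame arithmetic on column labels, so mismatched names produce NaN.

```python
import pandas as pd

P = pd.DataFrame({"power": [10.0, 20.0, 30.0]})
J = pd.DataFrame({"J": [2.0, 4.0, 5.0]})

# Naive division aligns on column labels; "power" and "J" do not match,
# so the result holds both columns filled with NaN.
naive = P / J


def divide_frames(a, b):
    """Hypothetical helper: divide a by b column by column, ignoring names."""
    if a.shape[1] != b.shape[1]:
        raise ValueError("both DataFrames must have the same number of columns")
    b_aligned = b.copy()           # leave the caller's DataFrame untouched
    b_aligned.columns = a.columns  # match names inside the function
    return a / b_aligned


L = divide_frames(P, J)
```

Here the result keeps the column names of the first argument, which is one possible convention; returning an MHKiT-chosen name would work equally well.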

My main reason for this is so that the DataFrames MHKiT creates can be passed directly back into another MHKiT function without the user having to convert them to a pd.Series.

ssolson commented 4 years ago

Okay, so I think you are saying a user should be able to pass a DataFrame with a single column, correct? Then, of course, the column name would not need to be used and we could just grab the first column.
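One way the "just grab the first column" idea could look, as a sketch only. The helper name `to_series` is hypothetical, and rejecting multi-column DataFrames is an assumption about the desired behavior rather than something decided in this thread.

```python
import numpy as np
import pandas as pd


def to_series(data):
    """Hypothetical helper: coerce supported inputs to a pd.Series."""
    if isinstance(data, pd.DataFrame):
        if data.shape[1] != 1:
            raise ValueError("DataFrame input must have exactly one column")
        # Select by position so the column name is never needed.
        return data.iloc[:, 0]
    if isinstance(data, pd.Series):
        return data
    if isinstance(data, np.ndarray):
        return pd.Series(data)
    raise TypeError("data must be np.ndarray, pd.Series, or pd.DataFrame")


single = pd.DataFrame({"any_name": [2.0, 4.0, 5.0]})
s = to_series(single)

double = pd.DataFrame({"a": [1.0], "b": [2.0]})
try:
    to_series(double)
    multi_rejected = False
except ValueError:
    multi_rejected = True
```

Calling a helper like this at the top of each function would let Series, arrays, and single-column DataFrames share one code path without reserving any column name.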

I think this may warrant further discussion on whether we should only return Series, because I believe that would solve your problem as well.

rpauly18 commented 4 years ago

A while back, Kate began working on methods to let MHKiT-Python know when MATLAB is the caller, which would allow the Python functions to return Series whenever MATLAB is not running them. We should revisit those methods.