OceanGNS / PGPT

A Python-based processing tool that turns Slocum glider binary files into self-describing US IOOS GDAC v3.0 NetCDF files

Number of profiles incorrect #7

Open mackenziemeier86 opened 2 months ago

mackenziemeier86 commented 2 months ago

Hi, I am getting the following output when I run PGPT on my data. We are only getting one profile per file when there should be more. Each generated .nc file also seems to correspond to a DBD/EBD file rather than to an individual profile. Any help would be greatly appreciated.

./run.sh -g unit_689 -d /Users/mackenzie/Desktop/PGPT-main -m metadata.yml -p delayed

usage: date [-jnRu] [-I[date|hours|minutes|seconds]] [-f input_fmt] [-r filename|seconds] [-v[+|-]val[y|m|w|d|H|M|S]] [[[[mm]dd]HH]MM[[cc]yy][.SS] | new_date] [+output_fmt]

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:268: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).

now = (datetime.utcnow()).strftime("%FT%TZ")

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:245: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] > 2000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:246: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] < 1000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:245: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] > 2000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:246: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] < 1000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:245: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] > 2000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:246: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] < 1000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:245: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] > 2000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:246: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] < 1000000000] = np.nan # T Remove bad times

profile_index not present in the data file
profile_direction not present in the data file
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file
profile_index not present in the data file
profile_direction not present in the data file
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file
profile_index not present in the data file
profile_direction not present in the data file
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file
profile_index not present in the data file
profile_direction not present in the data file
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:245: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] > 2000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:246: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] < 1000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:245: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] > 2000000000] = np.nan # T Remove bad times

/Users/mackenzie/Desktop/PGPT-main/scripts/bd2nc.py:246: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0! You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy. A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use df.loc[row_indexer, "col"] = values instead, to perform the assignment in a single step and ensure this keeps updating the original df.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

data['time'][data['time'] < 1000000000] = np.nan # T Remove bad times

profile_index not present in the data file
profile_direction not present in the data file
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file
profile_index not present in the data file
profile_direction not present in the data file
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file
(2,)
(2,)
(2,)
oxygen_sensor_temperature not present in the data file
oxygen_concentration not present in the data file
cdom not present in the data file
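For reference, the pandas and datetime warnings in the log above each name their own replacement. A minimal sketch of both follows, assuming `data` in bd2nc.py is a pandas DataFrame with a numeric `time` column in epoch seconds. The sample values are made up for illustration, and this is not the project's actual patch.

```python
from datetime import datetime, timezone

import numpy as np
import pandas as pd

# Hypothetical stand-in for the DataFrame that bd2nc.py calls `data`;
# the values are invented so that two of them fall outside the valid range.
data = pd.DataFrame({"time": [5.0e8, 1.5e9, 1.6e9, 2.5e9]})

# Single-step assignment through .loc, as the FutureWarning recommends,
# instead of the chained form data['time'][mask] = np.nan.
bad_time = (data["time"] > 2000000000) | (data["time"] < 1000000000)
data.loc[bad_time, "time"] = np.nan  # Remove bad times

# Timezone-aware replacement for the deprecated datetime.utcnow().
# The format is spelled out in full because %F and %T may not be
# supported by strftime on every platform.
now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(data)
print(now)
```

These changes would only address the warnings reported at bd2nc.py lines 245, 246, and 268. Whether they have any bearing on the profile count is not established in this thread.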

tb4764 commented 2 months ago

Can you share your files (or a subset of your files) so I can regenerate the NC files?

nbronikowski commented 2 weeks ago

@mackenziemeier86 - are you still having issues? Can you share a subset of the files you are trying to process so we can try it out?