Improve behaviour of unloaded symbols. Probably needs a couple of iterations of feedback. Breaks backwards compatibility on GdxSymbol.__init__ so probably worth including this in v2 instead of v1.3. This is somewhat faster and makes it easier to work with unloaded symbols.
Changes
Add constructor functions on GdxSymbol for creating a symbol from a dataframe (from_dataframe) and from a file (from_gdx). These should be used instead of __init__ in most use cases.
GdxSymbol.__init__ now always returns an unloaded symbol. To read metadata from the file (previous behaviour) use the from_gdx constructor.
Unloaded symbols have .dataframe=None, not an empty dataframe. Remove init_dataframe as no longer needed.
This should improve handling of (correctly) empty symbols, since an empty dataframe can now be told apart from an unloaded one.
It also makes gdxpds.to_dataframe on a gdx with many symbols much faster (20.439 -> 11.088 seconds with 1024 symbols), as we don't need to initialise lots of blank dataframes.
Move infer_data_type out of the Translator class. It seems like a generally useful util (and is used in GdxSymbol.from_dataframe). It possibly shouldn't live in gdx.py, as that file is already too large.
GdxSymbol.__init__ defaults to dims=None, not dims=0. This may improve handling of scalars in future.
Permit setting dims=None, if dims is already None.
If dims is unset (None), num_dims should be None not 0. This may improve handling of scalars in future.
.loaded is now calculated dynamically as (.dataframe is not None). This prevents possible bugs where ._loaded is incorrect.
Trying to read an unloaded dataframe raises an error instead of returning an empty dataframe.
An explicit error is good; breaking compatibility is bad. On balance this is better than silently loading an empty dataframe IMO.
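To make the intended semantics above easier to review, here is a minimal sketch of the behaviour using a stand-in class. SketchSymbol, from_records and the choice of RuntimeError are hypothetical names picked for illustration, and a plain list stands in for the dataframe so the snippet runs without pandas or GAMS. It is not the actual GdxSymbol code.

```python
class SketchSymbol:
    """Hypothetical stand-in for GdxSymbol; not the gdxpds implementation."""

    def __init__(self, name, dims=None):
        # __init__ always yields an unloaded symbol.
        self.name = name
        self.dims = dims          # None means "dims not set yet"
        self._data = None         # unloaded: no data object at all

    @classmethod
    def from_records(cls, name, dim_names, records):
        # Alternate constructor, similar in spirit to from_dataframe.
        symbol = cls(name, dims=list(dim_names))
        symbol._data = [tuple(r) for r in records]
        return symbol

    @property
    def loaded(self):
        # Derived from the data itself, so it cannot drift out of sync.
        return self._data is not None

    @property
    def num_dims(self):
        # Unset dims give None rather than 0.
        return None if self.dims is None else len(self.dims)

    @property
    def dataframe(self):
        # Reading an unloaded symbol fails loudly (error type is an assumption).
        if not self.loaded:
            raise RuntimeError(f"Symbol {self.name!r} is not loaded")
        return self._data


unloaded = SketchSymbol("x")
empty = SketchSymbol.from_records("y", ["i"], [])
print(unloaded.loaded, unloaded.num_dims)
print(empty.loaded, empty.num_dims, empty.dataframe)
```

The point of the sketch is that an empty but loaded symbol (empty above) and an unloaded one (unloaded above) are no longer indistinguishable, which is the case the None default is meant to fix.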
Tests