Closed info-rchitect closed 7 years ago
Can't you directly use the class method from_csv inside initialize? It is not possible to include CSV reading in DataFrame#initialize, since the method already performs too many functions, and adding more functionality like this would make the API confusing.
To be honest, I am not sure that you are doing the "right" thing from an architectural point of view. Daru::DataFrame will hardly make a good base class for anything except "dataframe with a few more features"; being a general-purpose base class was never the point of the design.
Hi,
Thanks for the input. I need Daru::DataFrame as a base class so that "business logic" and methods can be applied in a standardized manner. A simple example is building a JMP-like 'split' method on top of the pivot_table method, or doing proprietary calculations on the dataframe. I ended up letting users pass in a source argument that can be any supported file type, a hash, an array of arrays, etc. Loving the library, and I hope to contribute soon.
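For readers following along, here is a minimal sketch of what such a single source argument could look like. It is not the poster's actual code: TinyFrame is a hypothetical stand-in for Daru::DataFrame so the example runs with the standard library alone, and the `order:` keyword and the `TinyFrame.rows` helper are assumptions made for illustration. A file-path branch that delegates to from_csv could be added in the same `case`, but is left out here.

```ruby
# TinyFrame is a hypothetical stand-in for Daru::DataFrame so that the
# sketch runs without the daru gem.
class TinyFrame
  attr_reader :columns

  # columns: Hash of column name => Array of values
  def initialize(columns)
    @columns = columns
  end

  # Build a frame from an array of rows plus the column order.
  def self.rows(rows, order:)
    new(order.zip(rows.transpose).to_h)
  end
end

# Dataset takes one `source` argument and normalizes it before calling super.
class Dataset < TinyFrame
  def initialize(source, order: nil)
    columns =
      case source
      when Hash
        source
      when Array
        raise ArgumentError, 'order: is required for an array of rows' unless order

        TinyFrame.rows(source, order: order).columns
      else
        raise ArgumentError, "unsupported source: #{source.class}"
      end
    super(columns)
  end
end

from_hash = Dataset.new({ a: [1, 2], b: [3, 4] })
from_rows = Dataset.new([[1, 3], [2, 4]], order: %i[a b])
```

Both calls produce a Dataset holding the same columns; the dispatch happens before super, so the parent constructor only ever sees one input shape.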
regards
Regarding using Daru::DataFrame as a base class, it is needed so "business logic" and methods can be applied in a standardized manner.
I am still not 100% sure about the use case, but probably one of your options is to keep a DataFrame as an instance variable of your object and use the Forwardable module to delegate some methods to it.
DataFrame itself is not ready to be a base class because, for example, any of its methods that return a DataFrame will still return a plain DataFrame, not an instance of the child class (and a Vector, not some descendant class).
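A rough sketch of the composition approach suggested above, under stated assumptions: TinyFrame is again a hypothetical stand-in for Daru::DataFrame (so this runs without the gem), and `nrows`, `head` and `column_sums` on it are illustrative methods, not a claim about Daru's API. Only `Forwardable` and `def_delegators` are real standard-library names.

```ruby
require 'forwardable'

# TinyFrame is a hypothetical stand-in for Daru::DataFrame.
class TinyFrame
  attr_reader :columns

  def initialize(columns)
    @columns = columns
  end

  def nrows
    @columns.values.first.to_a.size
  end

  # Returns a new frame of the base class, as described in the thread.
  def head(n)
    TinyFrame.new(@columns.transform_values { |values| values.first(n) })
  end
end

# Dataset holds a frame instead of inheriting from it.
class Dataset
  extend Forwardable

  # Methods that return plain values can be forwarded unchanged.
  def_delegators :@frame, :nrows, :columns

  def initialize(frame)
    @frame = frame
  end

  # Methods that return a frame are re-wrapped so callers get a Dataset back.
  def head(n)
    self.class.new(@frame.head(n))
  end

  # Business logic lives on the wrapper (hypothetical example).
  def column_sums
    @frame.columns.transform_values(&:sum)
  end
end

dataset = Dataset.new(TinyFrame.new({ a: [1, 2, 3], b: [4, 5, 6] }))
```

The trade-off is that every frame-returning method you care about needs an explicit wrapper like `head`, while value-returning methods cost one line in `def_delegators`.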
So, may I close the issue?..
Yes, you may close the issue; thanks for the discussion. My app still uses DataFrame as a base class. I just intercept the results and decide what to do with the resulting DataFrame. There is some performance overhead, but most of the time I am doing lots of transforms (pivots, concats, joins, etc.) and only need to convert the last DataFrame into my object.
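A sketch of that "convert only the last result" pattern, as I understand it from the description. TinyFrame is a hypothetical stand-in for Daru::DataFrame, and `select_columns`, `concat` and `Dataset.wrap` are names invented for this example, not Daru methods.

```ruby
# TinyFrame is a hypothetical stand-in for Daru::DataFrame.
class TinyFrame
  attr_reader :columns

  def initialize(columns)
    @columns = columns
  end

  # Returns a TinyFrame even when called on a subclass instance.
  def select_columns(*names)
    TinyFrame.new(@columns.slice(*names))
  end

  # Appends the rows of `other`, column by column.
  def concat(other)
    TinyFrame.new(@columns.merge(other.columns) { |_name, mine, theirs| mine + theirs })
  end
end

class Dataset < TinyFrame
  # Convert a plain frame into a Dataset once, at the end of a pipeline.
  def self.wrap(frame)
    frame.is_a?(self) ? frame : new(frame.columns)
  end
end

left  = Dataset.new({ a: [1, 2], b: [3, 4] })
right = Dataset.new({ a: [5], b: [6] })

intermediate = left.concat(right).select_columns(:a) # plain TinyFrame
result = Dataset.wrap(intermediate)                   # Dataset again
```

Intermediate steps stay plain frames, so the conversion cost is paid once per pipeline rather than once per transform.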
Hi,
I want to inherit from Daru::DataFrame to create a generic Dataset class. The limitation (which could well be my own knowledge) is that I can't figure out how to pass options via super to ask the DataFrame to instantiate itself via the 'from_csv' or 'row' methods.
I can work around this by forcing users to instantiate Datasets via a wrapper method, but having this ability via super would make the code much cleaner.
thx