As part of the continued refactoring effort to improve the system's extensibility, we need to remove the design assumption that values are only int or string. Much of the code is hardcoded around these two types and relies on casts that support only int and string.
Removing this assumption paves the way for float support.
idea
Have an abstract value type that moves across the different operators in COOL.
Use specific types only at the source and at the end of the pipeline to handle the different data types.
solution
A general FieldValue type with two subtypes. The two subtypes correspond to the other upper-level apparatus in COOL: MetaFieldRS/WS, DataFieldRS/WS, filters, etc.
RangeField: numeric values (or, more generally, values from a totally ordered set).
HashField: values with a hash function that maps them to int.
An additional IntRangeField, to make explicit the intent of storing the HashField-converted integers (gids) in data chunks.
InputVector is made generic and returns FieldValue.
Filters and aggregators operate on FieldValue.
Type-specific logic is kept in InputVectorFactory, filters, and aggregators; invalid arguments are handled by exceptions.
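To make the type layout above concrete, here is a minimal Java sketch. Only the names FieldValue, RangeField, HashField and IntRangeField come from this PR; the method names (checkEqual, asDouble, hashToInt), the FloatRangeField and StringHashField classes, and the use of String.hashCode as the hash function are assumptions for illustration and may not match the actual code.

```java
// Hypothetical sketch: type names follow the PR text, all signatures are assumed.

/** Abstract value that moves between operators. */
interface FieldValue {
    boolean checkEqual(FieldValue other);
}

/** Values from a totally ordered set, e.g. numeric fields. */
interface RangeField extends FieldValue, Comparable<RangeField> {
    double asDouble(); // assumed accessor used for ordering

    @Override
    default int compareTo(RangeField other) {
        return Double.compare(asDouble(), other.asDouble());
    }

    @Override
    default boolean checkEqual(FieldValue other) {
        return other instanceof RangeField && compareTo((RangeField) other) == 0;
    }
}

/** Values that are mapped to an int by a hash function. */
interface HashField extends FieldValue {
    int hashToInt(); // assumed name
}

/** Explicit integer range value, e.g. a gid stored in a data chunk. */
final class IntRangeField implements RangeField {
    private final int value;
    IntRangeField(int value) { this.value = value; }
    public int getInt() { return value; }
    @Override public double asDouble() { return value; }
}

/** Illustrates how float support could plug in later (not part of this PR). */
final class FloatRangeField implements RangeField {
    private final float value;
    FloatRangeField(float value) { this.value = value; }
    @Override public double asDouble() { return value; }
}

/** String-backed hash value; String.hashCode stands in for the real hash. */
final class StringHashField implements HashField {
    private final String value;
    StringHashField(String value) { this.value = value; }
    @Override public int hashToInt() { return value.hashCode(); }
    @Override public boolean checkEqual(FieldValue other) {
        return other instanceof StringHashField
                && value.equals(((StringHashField) other).value);
    }
}
```

With this shape, operators compare and test values through FieldValue and RangeField methods, so adding a new numeric type only requires a new RangeField implementation.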
issues addressed
Common method abstractions avoid casting values to int and string.
Most InputVector-related casts are removed in favor of the more specific extended interfaces, which communicate our intent in handling HashField, RangeField, IntField, and internal integers.
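The following sketch shows one way a generic InputVector and a filter working on FieldValue could look once the int/string casts are gone. It is self-contained, so it redeclares small stand-in types; ListInputVector, RangeFilter, and all method signatures are hypothetical and are not taken from the COOL code base.

```java
import java.util.List;

// Minimal stand-ins so this sketch compiles on its own (hypothetical shapes).
interface FieldValue { }

final class IntRangeField implements FieldValue, Comparable<IntRangeField> {
    final int value;
    IntRangeField(int value) { this.value = value; }
    @Override public int compareTo(IntRangeField o) {
        return Integer.compare(value, o.value);
    }
}

final class StringHashField implements FieldValue {
    final String value;
    StringHashField(String value) { this.value = value; }
}

/** Generic reader: each vector hands out one FieldValue subtype. */
interface InputVector<T extends FieldValue> {
    boolean hasNext();
    T next();
}

/** List-backed vector, used here only to drive the example. */
final class ListInputVector<T extends FieldValue> implements InputVector<T> {
    private final List<T> values;
    private int pos = 0;
    ListInputVector(List<T> values) { this.values = values; }
    @Override public boolean hasNext() { return pos < values.size(); }
    @Override public T next() { return values.get(pos++); }
}

/** Filter that accepts any FieldValue but only understands range values. */
final class RangeFilter {
    private final IntRangeField min;
    private final IntRangeField max;

    RangeFilter(IntRangeField min, IntRangeField max) {
        this.min = min;
        this.max = max;
    }

    /** Type-specific logic lives here; wrong kinds are rejected by exception. */
    boolean accept(FieldValue v) {
        if (!(v instanceof IntRangeField)) {
            throw new IllegalArgumentException(
                    "RangeFilter expects a range value, got "
                            + v.getClass().getSimpleName());
        }
        IntRangeField r = (IntRangeField) v;
        return r.compareTo(min) >= 0 && r.compareTo(max) <= 0;
    }
}
```

The single type check sits inside the filter, so callers pass FieldValue through without knowing whether it came from a hash or a range field.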
outcome
better code readability and extensibility.
new issues caused
Issue#124: CoolTupleReader and KeyFieldIterator need to be reworked to use the new APIs.
possible changes
A specific abstraction for the user key field and the action time field.