ostafen / clover

A lightweight document-oriented NoSQL database written in pure Golang.
MIT License

Gob encoding and internal data types #46

Closed by ostafen 2 years ago

ostafen commented 2 years ago

Discussed in https://github.com/ostafen/clover/discussions/41

Originally posted by **ostafen** May 1, 2022

Hi everyone, I created this discussion to collect opinions and suggestions, since this is a very sensitive topic.

Currently, **CloverDB** serializes documents to JSON before storing them on disk. This is a leftover from early versions of the library, which stored data directly in ".json" files. Since Clover has evolved since then (it now uses the **badger** kv-store), this solution is no longer acceptable, for the following reasons:

- Instances of the `time.Time` struct cannot be correctly recovered: they are converted to strings when serialized and, as a consequence, `json.Unmarshal()` deserializes them as plain strings. This affects queries involving dates or times (unless you decide to store them as timestamps during document insertion).
- All numbers are silently converted to float64.

To fix these issues, I am thinking of switching to the **gob** encoding, which preserves the correct type of each document field. This opens a new question about internal data types: which numeric types should Clover support? Should we preserve all of the types (int, uint8/int8, uint16/int16, and so on), or should we restrict them (for example, using int64 for **integer** numbers and **float64** for floating-point numbers)?

What do you think about this?