I provided a new parsers.py module which implements different (de)serialization methods, e.g. cjson, json, json_stream, yaml.
The new module allows writing custom serialization of input data streams, e.g. we can write a custom C module to optimize serialization of incoming data.
I provided a test example to exercise the different formats, e.g.
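To illustrate the idea of pluggable serializers, here is a minimal sketch of a format registry; the names (`PARSERS`, `register`, `serialize`) are illustrative only, not the actual parsers.py API:

```python
import json

# Hypothetical registry mapping a format name to a (de)serializer
# instance; the real parsers.py may organize this differently.
PARSERS = {}

def register(name):
    """Class decorator that registers a serializer under `name`."""
    def wrap(cls):
        PARSERS[name] = cls()
        return cls
    return wrap

@register("json")
class JSONParser:
    def dumps(self, obj):
        return json.dumps(obj)
    def loads(self, data):
        return json.loads(data)

# A compiled C-extension module could register itself the same way,
# e.g. under the name "cjson", with dumps/loads backed by C code.

def serialize(fmt, obj):
    """Serialize `obj` with whichever parser is registered for `fmt`."""
    return PARSERS[fmt].dumps(obj)
```

With such a registry, switching formats is a matter of changing the format name passed on the command line.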
# use cjson format, and run tests 3 times
Server/Python/src/dbs/utils/parsers.py --fin=blocks.json --format=cjson --times=3
# use json format, and run tests 3 times
Server/Python/src/dbs/utils/parsers.py --fin=blocks.json --format=json --times=3
# use json_stream format, and run tests 3 times
Server/Python/src/dbs/utils/parsers.py --fin=blocks.json_stream --format=json_stream --times=3
# use yaml format, and run tests 3 times
Server/Python/src/dbs/utils/parsers.py --fin=blocks.yaml --format=yaml --times=3
Due to the dynamic nature of Python memory allocation it is hard to evaluate the impact of a particular format on a long-running DBSServer, but this PR makes it easy to switch between and test different formats. To do that, clients interacting with the DBS server will need to send data in the proper format, e.g. json_stream, so that we can measure the memory footprint of the DBS server in that case.
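The memory benefit of a streamed format is that the server can consume the payload record by record instead of materializing one large document. A rough illustration of that idea using newline-delimited JSON records (a simplification, not the DBS wire protocol or the exact json_stream token layout):

```python
import io
import json

def stream_records(fobj):
    """Yield one JSON object per line from a file-like object,
    so only a single record is resident in memory at a time."""
    for line in fobj:
        line = line.strip()
        if line:
            yield json.loads(line)

# A client would send many small records instead of one big list:
payload = io.StringIO('{"block": 1}\n{"block": 2}\n')
sizes = [rec["block"] for rec in stream_records(payload)]
```

Peak memory then scales with the largest record rather than with the whole payload.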
The provided convert2json_stream function converts either a given JSON (dict) object or a file object which contains a JSON data stream, e.g.
# example of how to convert JSON to json_stream
from dbs.utils.parsers import convert2json_stream
import json
data = {"data": 1, "foo": [1, 2, 3]}
convert2json_stream(data)
# this will produce the following output
{
"foo"
:
[1
, 2
, 3
]
,
"data"
:
1
}
# if you want to write this output to an output file you would do
obj = open('YOUR_FILE_NAME', 'w')
convert2json_stream(data, obj)
# similarly, if you have a file object which contains a JSON stream you may use it
fobj = open('YOUR_FILE.json')
convert2json_stream(fobj)
# similarly, I provide a convert2yaml function which converts given JSON to YAML
data = {"data": 1, "foo": [1, 2, 3]}
print(convert2yaml(data))
data: 1
foo:
- 1
- 2
- 3
With this module we can perform various tests on the DBS server using different input data formats.
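A simple way to time repeated (de)serialization rounds, similar in spirit to what the --times option drives (a sketch using the stdlib json parser, not the actual CLI code):

```python
import json
import time

def benchmark(data, times=3):
    """Run `times` serialize/deserialize round trips over `data`
    and return the elapsed wall-clock time in seconds."""
    start = time.time()
    for _ in range(times):
        json.loads(json.dumps(data))
    return time.time() - start

# Hypothetical payload standing in for a real blocks.json document.
elapsed = benchmark({"blocks": list(range(1000))}, times=3)
```

Swapping json for another registered parser in the loop gives a like-for-like comparison across formats.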
This PR addresses issues with a large memory footprint in the DBS server; see the full discussion in https://github.com/dmwm/DBS/issues/599