Closed sbillinge closed 8 months ago
Just to make sure I get this right:
* load_json_collection loads a collection downloaded from MP, [{'id_': ID1, 'first': val1}, {'id_': ID2, 'first': val2}], into something fsclient/pymongo can deal with (has 'id_' as keys) -> {ID1: {'id_': ID1, 'first': val1}, ID2: {'id_': ID2, 'first': val2}}
* and dump_json_collection does this in reverse? (the json output file has one entry {'id_': ID1, 'first': val1} per line)
Not quite. MP yields pure JSON in its payload, which is not in the form of a collection (a list of docs). The function load_json() in io.py should be able to read the MP payload (I didn't check).
We will then dump this to a collection in a database. By default, this "database" is a set of either yml or "json-collection" files in the directory ../db, relative to the ml4msrc.json file where we run the script. It sounds a bit complicated, but you get used to it quickly.
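For reference, here is a rough sketch of the round trip being discussed: a list of docs keyed by 'id_', written out as one JSON doc per line and read back. The function names (docs_to_keyed, dump_json_lines, load_json_lines) are made up for illustration and are not the actual functions in io.py; the 'id_' key and the one-doc-per-line layout are taken from the question above.

```python
import json
from pathlib import Path


def docs_to_keyed(docs, key="id_"):
    """Turn an iterable of docs into a dict keyed by each doc's id (hypothetical helper)."""
    return {doc[key]: doc for doc in docs}


def dump_json_lines(keyed, path):
    """Write one JSON doc per line, ordered by id so the file is stable across runs."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        for doc_id in sorted(keyed):
            f.write(json.dumps(keyed[doc_id], sort_keys=True) + "\n")


def load_json_lines(path, key="id_"):
    """Read a one-doc-per-line file back into a dict keyed by id, skipping blank lines."""
    with Path(path).open() as f:
        return docs_to_keyed((json.loads(line) for line in f if line.strip()), key)
```

Usage would be something like dump_json_lines(docs_to_keyed(payload), "../db/materials.json"), where payload is the list of docs parsed from the MP JSON and the file name is only a placeholder.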
Maybe I will try to run through this process locally using the json files you sent earlier.
Tests are passing, so I think this can be merged. It means that we should be able to read the json that came from MP into the standard style that we discussed.