Closed csadorf closed 5 years ago
Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).
Yes, that is a deliberate design decision.
Original comment by Bradley Dice (Bitbucket: bdice, GitHub: bdice).
I think that Collection.to_json()
is a good idea. Should it return a string, or should it write to a provided filename
or file_object
argument? It should probably have a corresponding from_json
method that works sort of like open
.
Thanks for explaining that design decision. It makes more sense now.
Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).
The Collection.to_json()
method should return a string. The corresponding Collection.from_json()
should be a classmethod
, accept a string-argument, and return an instance of Collection
.
I don't think providing a write()
method or similar that writes directly to a file-object makes much sense, because we will need to create the string anyways. It is one of the weaknesses of the JSON-format that we can't really stream it well.
Original comment by Bradley Dice (Bitbucket: bdice, GitHub: bdice).
@csadorf I may take it on soonish, for my needs in the photonics project. I would strongly prefer the ability to load/save directly to/from files, mirroring the style of pandas
: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
There has been some debate on how exactly to implement this. The latest decision was to replicate the pandas.DataFrame
API with respect to JSON export and import as closely as possible. I think that's a good idea, but please feel free to suggest a deviation if you think you have a compelling reason.
@bdice are you still interested in taking this on?
Sorry for the delay, have been busy with prelim prep and wrapping things up before Heather's departure. I am planning to spend this Thursday working on it, but will hand it over if it the need is more urgent.
On Tue, May 28, 2019 at 10:32 AM vyasr notifications@github.com wrote:
@bdice https://github.com/bdice are you still interested in taking this on?
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/glotzerlab/signac/issues/104?email_source=notifications&email_token=AFPT3TNDVIE6SUO4ENU6PITPXU66HA5CNFSM4G6RLZDKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWMKFQI#issuecomment-496542401, or mute the thread https://github.com/notifications/unsubscribe-auth/AFPT3TMRMKGS7HGZAPCDA3DPXU66HANCNFSM4G6RLZDA .
-- University of Michigan, Ann Arbor Ph.D. Candidate - Mayes, Glotzer Labs College of Engineering - Chemical Engineering
@xadams That would be awesome - I did some reading on the pandas DataFrame
API and this is what we would shoot for (this is approximate / pseudocode):
from .core import json
# In the Collection class, add this method
def to_json(self, file=None):
json_string = json.dumps([i for i in collection.find()])
if file is None:
# return as a JSON string -
return json_string
if isinstance(file, six.string_types):
# opens the file as a file path
with open(file, 'w') as json_file:
# write to that file path
json_file.write(json_string)
else:
file.write(json_string)
@classmethod
def read_json(cls, file=None):
if isinstance(file, six.string_types):
with open(file, 'r') as json_file:
json_data = json.load(json_file)
# Instantiate and return Collection.
else:
json_data = json.load(file)
# Instantiate and return Collection.
Resolved with merge of PR #200 .
Original report by Bradley Dice (Bitbucket: bdice, GitHub: bdice).
Currently collections don't output valid JSON files, just "very close to valid JSON files." Collections use a text file of JSON dictionaries separated by newlines, while a valid JSON file would only need
len(collection)+2
more bytes for square braces and commas.I'm not sure why this design decision was made, but it seems that there would be a clear advantage to using a valid JSON file (so long as there is no serious performance penalty).
It would be nice if there was a better way to use Collections with a JSON backend. Currently I have a workaround to save a JSON file:
json.dump([i for i in collection.find()], open('valid.json', 'w'))