Current State
Currently, the formats package contains a hierarchy of files to handle the various formats that libcchdo can read and write. The formats package is scanned for modules (.py files in directories with __init__.py) containing methods named read, write, or both.
The read and write methods themselves do not have a consistent interface. They all accept at least two positional arguments: the first is either a DataFile or a DataFileCollection, and the second is a file opened for reading (or, for writers, writing). Any subsequent arguments are usually (always?) limited to keyword arguments for non-standard information the reader/writer may need. A module may also define methods for recognizing the format it reads/writes and for listing possible file extensions of the format.
To summarize, what appears to define a format reader or writer is:
Is in the formats package of libcchdo
Has methods named read or write or both
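The two criteria above could be checked roughly as follows. This is only a sketch of the idea, not libcchdo's actual scanning code: the helper names `capabilities` and `find_format_modules` are made up for illustration, and the package to scan is passed in rather than assumed.

```python
import importlib
import pkgutil


def capabilities(module):
    """Return which of 'read' and 'write' the module defines as callables."""
    return {
        name
        for name in ("read", "write")
        if callable(getattr(module, name, None))
    }


def find_format_modules(package):
    """Map module name -> {'read', 'write'} for every module under package.

    Hypothetical helper: walks the package tree, imports each module,
    and keeps only those that define read, write, or both.
    """
    found = {}
    prefix = package.__name__ + "."
    for info in pkgutil.walk_packages(package.__path__, prefix):
        module = importlib.import_module(info.name)
        caps = capabilities(module)
        if caps:
            found[info.name] = caps
    return found
```

Note that nothing here checks the argument signature of read or write, which is exactly the inconsistency described above.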
Proposal
Semi-Simple
The read and write methods should be unified in the argument data types they expect. The most apparent choice is to always expect a DataFileCollection and an appropriately opened file. Current methods that only expect a single DataFile would need to be modified to deal with a DataFileCollection of length 1 (or more).
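One way the semi-simple option might look is a thin adapter around the existing single-DataFile readers, so they do not all need rewriting by hand. The sketch below uses stand-in `DataFile` and `DataFileCollection` classes (not libcchdo's real ones), and the names `read_single` and `as_collection_reader` are hypothetical.

```python
import io


class DataFile:
    """Stand-in for libcchdo's DataFile; holds parsed rows only."""

    def __init__(self):
        self.rows = []


class DataFileCollection:
    """Stand-in for libcchdo's DataFileCollection; a list of DataFiles."""

    def __init__(self):
        self.files = []


def read_single(datafile, fileobj):
    """An existing-style reader that fills exactly one DataFile."""
    for line in fileobj:
        line = line.strip()
        if line:
            datafile.rows.append(line.split(","))


def as_collection_reader(read_one):
    """Wrap a single-DataFile reader so it accepts a DataFileCollection."""

    def read(collection, fileobj, **kwargs):
        datafile = DataFile()
        read_one(datafile, fileobj, **kwargs)
        collection.files.append(datafile)

    return read


# The module would then expose the unified signature.
read = as_collection_reader(read_single)

collection = DataFileCollection()
read(collection, io.StringIO("1,2\n3,4\n"))
print(len(collection.files), collection.files[0].rows)
```

Writers would need the reverse treatment, and a writer handed a collection of more than one DataFile still has to decide what to do with a single open file.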
Complex Overhaul
Another possible solution would be to change the formats modules to class-based ones. Each format would then be any class that inherits from some BaseFormat class that defines the interface. This could possibly allow other methods such as pre read/write hooks and format recognizers (file extension, headers, attempting to actually read, etc.).
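A rough sketch of what such a class-based interface could look like is below. Apart from the name BaseFormat, everything here is an assumption made for illustration: the `extensions` attribute, the `recognizes`, `pre_read` and `load` methods, the toy `LineFormat`, and the use of a plain list in place of a DataFileCollection.

```python
import io
from abc import ABC, abstractmethod


class BaseFormat(ABC):
    """Hypothetical base class defining the reader interface."""

    extensions = ()  # subclasses override, e.g. (".csv",)

    @classmethod
    def recognizes(cls, filename):
        """Cheapest recognizer: match on the file extension."""
        return filename.lower().endswith(tuple(cls.extensions))

    def pre_read(self, collection, fileobj):
        """Optional hook run before read(); does nothing by default."""

    @abstractmethod
    def read(self, collection, fileobj):
        """Parse fileobj and add the result to collection."""

    def load(self, collection, fileobj):
        """Run the pre-read hook, then the format-specific reader."""
        self.pre_read(collection, fileobj)
        self.read(collection, fileobj)


class LineFormat(BaseFormat):
    """Toy format: one record per non-empty line."""

    extensions = (".txt",)

    def read(self, collection, fileobj):
        collection.append([line.strip() for line in fileobj if line.strip()])


def find_format(filename, formats):
    """Return the first format class that recognizes filename, else None."""
    for fmt in formats:
        if fmt.recognizes(filename):
            return fmt
    return None


fmt = find_format("station.txt", [LineFormat])
collection = []  # a plain list stands in for a DataFileCollection
fmt().load(collection, io.StringIO("a\n\nb\n"))
print(collection)
```

Because read is abstract, a format class that forgets to implement it fails at instantiation rather than at the first call, which is one concrete benefit over scanning modules for names.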
Potential Issues
One potential issue involves the open file of appropriate type. Some formats may want to write multiple files. It may be better to pass the methods a path and then let each read/write method deal with that path however it needs. For example, a writer method may take a path and then create a directory and fill it with files of its own naming scheme rather than just opening a single file for writing.
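As an illustration of the path-based idea, a writer could look something like the sketch below. The `file_NNN.csv` naming scheme and the use of nested lists in place of a DataFileCollection are assumptions made only for this example.

```python
import tempfile
from pathlib import Path


def write(collection, path):
    """Write each item of collection to its own file under path.

    The writer, not the caller, decides that path is a directory
    and how the files inside it are named.
    """
    outdir = Path(path)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, rows in enumerate(collection):
        target = outdir / f"file_{index:03d}.csv"
        with target.open("w") as handle:
            for row in rows:
                handle.write(",".join(row) + "\n")
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    paths = write([[["1", "2"]], [["3", "4"]]], Path(tmp) / "out")
    print([p.name for p in paths])
```

A single-file format would simply open the path itself, so the same signature covers both cases; the cost is that callers can no longer hand in an already-open stream such as standard output.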
Request for Comments
What other potential issues might be encountered if either of these potential solutions is implemented?
Is there any reason to choose one solution over the other (other than the amount of work required)?
Is there some other solution that this author hasn't thought of?
Original report by abarna (Bitbucket: abarna, GitHub: abarna).