Open cjcodeproj opened 4 months ago
Code should work something like this.
formatter = FormatterClass.new(Style.TEXT)
formatter.format(Output.LIST)
output = formatter.headers()
output += formatter.format_batch(in_batch)
output += formatter.close()
print(output)
There could be other methods, but the idea is that the code could output either plain text or HTML without any code change at the application level. The code should also automatically change its behavior based on the type of content that is being output.
Based on the code in media.tools.movies.list and media.tools.audio.albums.list:
Output Format | Object Class | Report Type |
---|---|---|
text | Movie | List Entry |
html | Album | Mini List Entry |
markdown | Song | Detailed Entry |
Essay |
The trick will be to make the code extensible to handle all types of content, flexible to handle all output formats, but also still support a polymorphic interface.
The different output formats will probably be handled by some kind of Driver level class, where the formatter references the driver, and each driver adheres to a similar protocol for output calls.
The output format driver should be the dumbest code of them all. No knowledge of the data it's outputting, just helper functions that are designed to pad and alter the values into suitable output.
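A minimal sketch of what two such "dumb" drivers might look like. The class and method names (`PlainTextDriver`, `HtmlDriver`, `pad`, `row`) are assumptions made for illustration and are not taken from the code base; the point is only that both expose the same calls and neither knows what a movie or an album is.

```python
import html


class PlainTextDriver:
    """Hypothetical plain text driver: formats values, knows nothing about the data."""

    def pad(self, value, width, align='left'):
        """Pad or truncate a value to exactly `width` characters."""
        text = str(value)[:width]
        if align == 'right':
            return text.rjust(width)
        if align == 'center':
            return text.center(width)
        return text.ljust(width)

    def row(self, cells, widths):
        """Join padded cells into one line of output."""
        padded = [self.pad(c, w) for c, w in zip(cells, widths)]
        return ' '.join(padded).rstrip() + '\n'


class HtmlDriver:
    """Hypothetical HTML driver exposing the same two calls."""

    def pad(self, value, width, align='left'):
        # An HTML table sizes its own columns, so width and align are
        # ignored here; the value only needs escaping.
        return html.escape(str(value))

    def row(self, cells, widths):
        """Wrap each escaped cell in <td> and the whole row in <tr>."""
        tds = ''.join(f'<td>{self.pad(c, w)}</td>' for c, w in zip(cells, widths))
        return f'<tr>{tds}</tr>\n'
```

A formatter holding a reference to either driver could call `driver.row(cells, widths)` without caring which one it has.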
We're taking things for granted, like alignment/justification, and column widths (which just magically work when you're dealing with tables). Also things like plain text output using fixed width fonts, which will become an issue when formats like HTML come into play.
The driver building a table should take two parameters
There are 3 output formats supported in the test code: plain text, CSV, and HTML. Each driver outputs a table with a list of movies.
There are pros and cons to each system. Plain text has a fixed table size, whereas an HTML table adapts to the field length. In plain text, field width is determined by the number of characters, but HTML has options for widths using units like inches, ems, and pixels. On the other hand, HTML doesn't natively support output of a fixed-width floating point value.
There should be a mechanism where drivers have feature flags that can identify things that the driver is capable of.
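One possible shape for this, sketched with the standard library `enum.Flag`. The flag names and the `supports` helper are hypothetical, and the driver classes are stubs that only carry a `features` attribute.

```python
from enum import Flag, auto


class DriverFeature(Flag):
    """Hypothetical capability flags a driver could advertise."""
    NONE = 0
    FIXED_WIDTH = auto()      # columns are padded to a character width
    ROW_GROUPS = auto()       # rows can be grouped (e.g. HTML <tbody>)
    RELATIVE_WIDTHS = auto()  # widths in units such as em or px


class PlainTextDriver:
    """Stub driver: only the feature declaration is shown."""
    features = DriverFeature.FIXED_WIDTH


class HtmlDriver:
    """Stub driver: only the feature declaration is shown."""
    features = DriverFeature.ROW_GROUPS | DriverFeature.RELATIVE_WIDTHS


def supports(driver, feature):
    """Return True if the driver advertises every flag in `feature`."""
    return (driver.features & feature) == feature
```

The middle layer could then ask `supports(driver, DriverFeature.ROW_GROUPS)` before deciding whether to group rows or fall back to a flat list.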
There should also be features to do things like indent the table output by 10 character positions.
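For plain text output, indenting can be done after rendering rather than inside the table code. A small sketch using `textwrap.indent` from the standard library; the function name `indent_table` is an assumption, and this only makes sense for character-based formats.

```python
import textwrap


def indent_table(rendered, positions=10):
    """Shift every non-blank line of rendered text right by `positions` spaces."""
    return textwrap.indent(rendered, ' ' * positions)
```

By default `textwrap.indent` leaves whitespace-only lines untouched, so blank separator lines do not pick up trailing spaces.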
Tables where rows are grouped into batches should make use of the HTML <tbody> element. The add_row() method should probably be supplemented by an add_row_batch() method to accommodate this.
The table should have at least one <tbody> element, regardless of the structure of the rows, so the code will need to keep track of the rows as they are added. It should probably be a simple method that counts the rows as they are added.
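A rough sketch of how the row tracking could work. `HtmlTableSketch` is a hypothetical class written for this note, not the driver in the repository; it keeps rows in groups, counts them as they arrive, and always renders at least one <tbody>.

```python
import html


class HtmlTableSketch:
    """Hypothetical HTML table builder that always emits at least one <tbody>."""

    def __init__(self):
        self.bodies = [[]]   # list of row groups; starts with one open group
        self.row_count = 0

    def add_row(self, *cells):
        """Append a row to the current group and count it."""
        self.bodies[-1].append(cells)
        self.row_count += 1

    def add_row_batch(self, rows):
        """Place a batch of rows in a group of its own (its own <tbody>)."""
        if self.bodies[-1]:
            self.bodies.append([])   # close the group that was in progress
        for cells in rows:
            self.add_row(*cells)
        self.bodies.append([])       # later add_row() calls start a new group

    def render(self):
        """Render all non-empty groups, or one empty <tbody> if there are none."""
        groups = [g for g in self.bodies if g] or [[]]
        out = ['<table>']
        for group in groups:
            out.append('<tbody>')
            for cells in group:
                tds = ''.join(f'<td>{html.escape(str(c))}</td>' for c in cells)
                out.append(f'<tr>{tds}</tr>')
            out.append('</tbody>')
        out.append('</table>')
        return '\n'.join(out) + '\n'
```

With this layout, a table built only from add_row() calls still gets a single <tbody>, and each batch gets one of its own.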
Right now all of the drivers have an identical class and method structure, but there is almost no shared code between them. If the code base were written in Objective-C, Swift, or Java, the classes would all adhere to a protocol specification.
Protocols don't come into play in Python unless you're also doing typing. It should be a future consideration to create protocol definitions when typing is implemented.
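For future reference, a protocol definition could look roughly like the sketch below, using `typing.Protocol`. The method list mirrors the calls used in the test code in this thread; the names `TableDriver`, `CsvTableSketch`, and `fill` are assumptions, and column specs are plain strings here for brevity.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class TableDriver(Protocol):
    """Hypothetical protocol covering the calls the test code makes."""
    output: str

    def add_column(self, spec: Any) -> None: ...
    def start(self) -> None: ...
    def headers(self) -> None: ...
    def add_row(self, *cells: str) -> None: ...
    def add_subhead(self, text: str) -> None: ...
    def finish(self) -> None: ...


class CsvTableSketch:
    """Hypothetical driver; it satisfies the protocol without inheriting from it."""

    def __init__(self):
        self.columns = []
        self.output = ''

    def add_column(self, spec):
        self.columns.append(spec)

    def start(self):
        self.output = ''

    def headers(self):
        # Naive join with no quoting, to keep the sketch short.
        self.output += ','.join(self.columns) + '\n'

    def add_row(self, *cells):
        self.output += ','.join(cells) + '\n'

    def add_subhead(self, text):
        pass  # CSV has no notion of a subheading

    def finish(self):
        pass


def fill(table: TableDriver) -> str:
    """Application code written against the protocol, not a concrete driver."""
    table.add_column('Title')
    table.start()
    table.headers()
    table.add_row('Condorman')
    table.finish()
    return table.output
```

A static type checker would verify that every driver passed to `fill` has the listed members; `@runtime_checkable` additionally allows an `isinstance` check, which only tests that the members exist, not their signatures.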
Test code implementation currently looks like this:
from media.fmt.driver.generic import TableColumnSpec, TableColumnAlign
from media.fmt.driver.selector import Selector
driver = Selector.load_driver('text')
table = driver.get_table()
table.add_column(TableColumnSpec('Title',20))
table.add_column(TableColumnSpec('Length',10))
table.add_column(TableColumnSpec('Genre',15))
table.start()
table.headers()
table.add_row('Condorman','1:00:00','Action')
table.add_row('Snake Movie','1:01:00','Drama')
table.add_subhead('New movies')
table.add_row('Catch The Last Train','2:00:00','Western')
table.add_row('Saddlebag Full Of Bullets','1:30:00','Western')
table.finish()
print(table.output,end='')
There are output mechanisms for plain text, HTML, and CSV formats. The markdown format was dropped because Markdown can support embedded HTML, and the syntax isn't flexible enough to handle some use cases. In this code example, changing the output format only requires a change to a single line of code.
There are 3 layers to the code.
Right now the driver layer caches the data and returns it all as a single string. But should it? Or should the middle layer capture all the data and return it as a single string?
If we want to keep the drivers simple, then the output should be preserved at the middle layer. Does the driver layer need to maintain state? It could be helpful to track the number of rows, but not 100% sure if it's needed.
On the other hand, when rendering an HTML table, there is a need to track column and row information for things like cell or header id. Also, if the table ever has a <tfoot> block, its position matters: HTML 4 required it to follow the <thead> but precede the <tbody>, while the current HTML standard places it after the <tbody> elements. So, the driver needs to know what the entire table is like in order to get those elements output in the right order.
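A sketch of the consequence: the sections have to be collected first and only emitted once the table is complete. The function name `render_sections` and the `foot_before_body` flag are hypothetical; the flag exists because the required position of <tfoot> differs between HTML 4 (before <tbody>) and the current HTML standard (after it).

```python
def render_sections(head, bodies, foot=None, foot_before_body=False):
    """Emit table sections in a fixed order once the whole table is known.

    head and foot are lists of already rendered <tr> strings; bodies is a
    list of such lists, one per <tbody>.
    """
    def block(tag, rows):
        return [f'<{tag}>', *rows, f'</{tag}>']

    body_parts = []
    for rows in bodies:
        body_parts += block('tbody', rows)
    foot_parts = block('tfoot', foot) if foot else []

    parts = ['<table>'] + block('thead', head)
    if foot_before_body:      # ordering required by HTML 4
        parts += foot_parts + body_parts
    else:                     # ordering in the current HTML standard
        parts += body_parts + foot_parts
    parts.append('</table>')
    return '\n'.join(parts) + '\n'
```

Because nothing is written until every section has been supplied, the caller can add the footer at any point while building the table.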
Coding Notes:
The following sample code (not committed) can generate a full HTML list of movies.
#!/usr/bin/env python
# Test program to output a list
import media.fileops.repo
from media.generic.sorting.organizer import Organizer
from media.generic.sorting.batch import Batch
from media.fmt.content.movie.list import TableList
repo = media.fileops.repo.Repo('/home/chrisj/xml/m/internal-db')
repo.scan()
repo.load()
print(f"<!-- {len(repo.media)} -->")
movies = repo.get_movies()
organizer = Organizer(movies)
batches = organizer.create_batches(None)
print(f"<!-- {len(batches)} -->")
tl1 = TableList()
tl1.setup('html')
tl1.batch(batches[0])
print(tl1.get_output())
The middle layer object is the TableList class, which organizes the data and then uses a driver class just for generating the table HTML code.
Tables are objects, but they have no output formatting functionality.
There are separate formatters for plain text, HTML, and CSV output.
One table can be passed to multiple formatters.
Sample code
from media.fmt.structure.table import Table, TableColumnSpec
from media.fmt.formatter.selector import Selector
html_formatter = Selector.load_driver('html')
html_table = html_formatter.get_table()
pt_formatter = Selector.load_driver('plaintext')
pt_table = pt_formatter.get_table()
csv_table = Selector.load_driver('csv').get_table()
t = Table()
t.add_column(TableColumnSpec('Title',20))
t.add_column(TableColumnSpec('Length',10))
t.add_column(TableColumnSpec('Genre',15))
t.start()
t.add_row('Condorman','1:00:00','Action')
t.add_row('Generic Snake Movie','1:01:00','Drama')
t.add_body()
t.set_body_header('New movies')
t.add_row('Catch The Last Train','2:00:00','Western')
t.add_row('Saddlebags Full Of Danger','1:30:00','Western')
t.finish()
out = html_table.render(t)
out2 = pt_table.render(t)
print(out)
print(out2)
print(csv_table.render(t))
Classes under the structure package contain the data. Classes under the formatter package handle the output.
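The split can be illustrated with a self-contained sketch. The classes below are simplified stand-ins written for this note, not the actual `media.fmt.structure` and `media.fmt.formatter` classes: the table only stores data, and each formatter turns the same table into a string.

```python
import csv
import io


class Table:
    """Stand-in for the structure layer: holds data, renders nothing."""

    def __init__(self):
        self.columns = []   # (title, width) pairs
        self.rows = []

    def add_column(self, title, width):
        self.columns.append((title, width))

    def add_row(self, *cells):
        self.rows.append(cells)


class PlainTextFormatter:
    """Stand-in formatter: pads each cell to its column width."""

    def render(self, table):
        titles = tuple(title for title, _ in table.columns)
        lines = []
        for cells in [titles] + table.rows:
            padded = [str(c)[:w].ljust(w)
                      for c, (_, w) in zip(cells, table.columns)]
            lines.append(' '.join(padded).rstrip())
        return '\n'.join(lines) + '\n'


class CsvFormatter:
    """Stand-in formatter: ignores widths and lets the csv module quote values."""

    def render(self, table):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator='\n')
        writer.writerow([title for title, _ in table.columns])
        writer.writerows(table.rows)
        return buf.getvalue()
```

Since the table carries no output state, the same instance can be handed to both formatters in any order.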
For the tools that generate output reports, most of the processing is based on loading and sorting the data. The output part is easy, and most of that code is in other modules.
Consider an API framework that abstracts the output even further, so that a tool like media.tools.movies.list could handle multiple output formats with very little change to the code base.