Closed lmmx closed 3 years ago
The `summarise_in_chunks` function in `tap.precis` returns a tuple of two lists, `(summaries, chunk_sizes)`.

The `chunk_sizes` can be iterated over to give chunk ranges as:
chunk_seek = 0
for chunk_size in chunk_sizes:
    chunk_range = range(chunk_seek, chunk_seek + chunk_size)
    chunk_seek += chunk_size
`chunk_seek` will then be the length of the list of documents (i.e. the list of transcripts here).

I’d like to add a method to `Stream`:
def export_transcripts(self, format=None, out_dir=None, out_domain=None):
    """
    If `format` is None (the default) the transcripts are exported as plain text,
    if `format` is "md" they are converted to GitHub-flavoured markdown (using
    `<details>` blocks to display a collapsed view of the full transcripts and
    `<summary>` tags to display the associated summary text),
    if `format` is "mmd" they are converted to quill's lever MMD format
    (specifically designed for transcripts, particularly when clauses are split
    on conjunctives).
    """
    if out_domain:
        # Derive the output directory from the quill domain and the stream date
        from quill import AddressPath
        ymd = [*self.date.timetuple()][:3]
        out_dir = AddressPath.from_parts(domain=out_domain, ymd=ymd).filepath
    elif out_dir is None:
        raise ValueError("No output directory supplied for transcript export")
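
A minimal sketch of how the chunk ranges and the "md" format described in the docstring could fit together. The helper names `chunk_ranges` and `render_md` are hypothetical (they are not part of `tap`), and the sample data is made up to stand in for the output of `summarise_in_chunks`:

```python
from html import escape
from itertools import accumulate


def chunk_ranges(chunk_sizes):
    """Turn a list of chunk sizes into consecutive ranges over the documents."""
    ends = list(accumulate(chunk_sizes))
    starts = [0] + ends[:-1]
    return [range(start, end) for start, end in zip(starts, ends)]


def render_md(summaries, chunk_sizes, transcripts):
    """Render one <details> block per chunk, with its summary as the <summary>."""
    blocks = []
    for summary, chunk_range in zip(summaries, chunk_ranges(chunk_sizes)):
        # Join the transcripts belonging to this chunk, escaping any HTML
        body = "\n".join(escape(transcripts[i]) for i in chunk_range)
        blocks.append(
            f"<details>\n<summary>{escape(summary)}</summary>\n\n{body}\n\n</details>"
        )
    return "\n\n".join(blocks)


# Made-up data standing in for the output of summarise_in_chunks
transcripts = ["line one", "line two", "line three"]
summaries = ["first two lines", "last line"]
chunk_sizes = [2, 1]
md = render_md(summaries, chunk_sizes, transcripts)
```

Each summary is paired with the range of transcripts it covers, so the chunk sizes must sum to the number of transcripts for every transcript to appear in exactly one block.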
Done: use a transcript summaries template that is interpreted by `quill.fold.wire` as ‘summary and detail’ where the MMD document itself indicates ‘Q&A’: https://github.com/lmmx/tap/commit/69ad95bedadad8d08bbb2ab8a6dc42bb577ef94c
Handled in `wire` by adding a `nested_qa_summary` parameter corresponding to a `Sink` attribute turned on in the `RadioTranscriptSummariesDocSink` class (indicated in the sink enum by the `radio_transcript_summaries` key): https://github.com/spin-systems/quill/commit/8db86be58961ceb9bb96c56b05a21319c6ec54ea
Swap `enum.Enum` for an `aenum.NamedConstant` (some minor interface differences) so that `-:` prefixes can indicate either a `Prefix.Answer` or a `Prefix.InitList` token, depending on context
`quill` will build nested summarisation pages like wire
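
One standard library behaviour that may be relevant to the note above about `enum.Enum` and the `-:` prefix (this is an assumption, since the issue does not state the reason): an `Enum` treats two names with the same value as aliases of a single member. The sketch below shows that with the standard library only. The `EnumPrefix` class and the `classify_prefix` helper are hypothetical illustrations and are not taken from `quill`:

```python
from enum import Enum


class EnumPrefix(Enum):
    Answer = "-:"
    InitList = "-:"  # duplicate value: Enum makes this an alias of Answer


# Both names resolve to the same member, so a parser cannot tell them apart
same_member = EnumPrefix.InitList is EnumPrefix.Answer
members = list(EnumPrefix)  # iteration skips aliases


def classify_prefix(prefix, in_qa_block):
    """Hypothetical helper: choose a token name for a prefix from its context."""
    if prefix == "-:":
        return "Answer" if in_qa_block else "InitList"
    raise ValueError(f"Unrecognised prefix: {prefix!r}")
```

With a plain `Enum`, looking up `"-:"` can only ever give back `Answer`, so any context-dependent reading of the prefix has to be decided outside the enum, as the hypothetical helper does.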