Closed ronaldtse closed 2 years ago
We probably want to strip out all \n and \t from the content that do not represent new paragraphs. They are formatting concerns that have no relevance to semantics.
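A minimal sketch of that kind of clean-up, in Python for illustration only (Relaton itself is not written in Python). It assumes that a blank line is what marks a new paragraph, which the issue does not specify, and `normalize_whitespace` is a hypothetical name:

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse formatting-only whitespace while keeping paragraph breaks.

    Assumption: a blank line (two newlines, optionally with spaces or tabs
    in between) is the only thing that separates paragraphs.
    """
    # Split on blank lines so paragraph boundaries survive.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, any run of spaces, \n, or \t becomes one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    # Drop paragraphs that were whitespace only.
    return "\n\n".join(p for p in cleaned if p)


raw = "This document\n\tdescribes  a\n   protocol.\n\n\tSecond\tparagraph.\n"
print(normalize_whitespace(raw))
```

If paragraphs are marked with tags rather than blank lines, the same collapse could be applied to the text of each tag instead.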
@ronaldtse fixed, but we need to keep <p>. RFC 7991 has a <t> element. Nick asked me to replace <t> with <p> because <t> isn't allowed in Metanorma. So when Relaton parses BibXML, it replaces <t> with <p> in abstracts. When BibXML is rendered, it replaces <p> back with <t>.
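The round trip described above can be sketched roughly as follows. This is an illustration in Python using the standard library's `xml.etree.ElementTree`, not Relaton's actual code; the function names `rename_tags`, `bibxml_to_relaton`, and `relaton_to_bibxml` are hypothetical, and the sample abstract is made up:

```python
import xml.etree.ElementTree as ET


def rename_tags(xml: str, old: str, new: str) -> str:
    """Return `xml` with every <old> element renamed to <new>."""
    root = ET.fromstring(xml)
    # iter() walks the root and all descendants that carry the given tag.
    for element in root.iter(old):
        element.tag = new
    return ET.tostring(root, encoding="unicode")


def bibxml_to_relaton(abstract_xml: str) -> str:
    # Parsing direction: RFC 7991 <t> becomes <p>.
    return rename_tags(abstract_xml, "t", "p")


def relaton_to_bibxml(abstract_xml: str) -> str:
    # Rendering direction: <p> goes back to <t>.
    return rename_tags(abstract_xml, "p", "t")


bibxml = "<abstract><t>First paragraph.</t><t>Second paragraph.</t></abstract>"
stored = bibxml_to_relaton(bibxml)
print(stored)
print(relaton_to_bibxml(stored) == bibxml)
```

The sketch only covers <t>; the other elements RFC 7991 allows in an abstract (<dl>, <ol>, <ul>) would pass through unchanged.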
@andrew2net I see, but is the data being updated? I don't see the data being updated daily yet: https://github.com/ietf-ribose/relaton-data-ids/blob/main/data/DRAFT-3K1N-6TISCH-ALICE0-00.yaml
I want to verify that the whitespace is stripped.
Regarding <p>. I see that RFC 7991 does support some rich-text:
2.1. <abstract>
Contains the Abstract of the document. See [RFC7322] for more
information on restrictions for the Abstract.
This element appears as a child element of <front> (Section 2.26).
Content model:
In any order, but at least one of:
o <dl> elements (Section 2.20)
o <ol> elements (Section 2.34)
o <t> elements (Section 2.53)
o <ul> elements (Section 2.63)
And that RFC 7322 specifies that:
4.3. Abstract Section
Every RFC must have an Abstract that provides a concise and
comprehensive overview of the purpose and contents of the entire
document, to give a technically knowledgeable reader a general
overview of the function of the document.
Composing a useful Abstract generally requires thought and care.
Usually, an Abstract should begin with a phrase like "This memo ..."
or "This document ..." A satisfactory Abstract can often be
constructed in part from material within the Introduction section,
but an effective Abstract may be shorter, less detailed, and perhaps
broader in scope than the Introduction. Simply copying and pasting
the first few paragraphs of the Introduction is allowed, but it may
result in an Abstract that is both incomplete and redundant. Note
also that an Abstract is not a substitute for an Introduction; the
RFC should be self-contained as if there were no Abstract.
Similarly, the Abstract should be complete in itself. It will appear
in isolation in publication announcements and in the online index of
RFCs. Therefore, the Abstract must not contain citations.
So multiple paragraphs are allowed and it is fine to use <p> for that.
> but we need to keep <p>. There is <t> element in rfc7991. Nick asked me to replace <t> with <p> because in the metanorma <t> isn't allowed. So when relaton parses BibXML it replaces <t> with <p> in abstracts. When BibXML is rendered it replaces <p> with <t> back.
How Metanorma deals with text is technically of no concern to Relaton. Metanorma is only a consumer of Relaton data here.
What I am trying to get at is that the "abstract" should be in a text format that is interoperable. No one uses RFC 7991 text formatting outside IETF, and therefore we shouldn't either.
However, since we do not yet have a clear spec of what rich-text format is to be used in the Relaton abstract, I'm fine to leave this as is and define that in a separate Relaton issue.
Under abstract > content we need to strip away the wrapping (empty) tags and the surrounding empty space: https://github.com/ietf-ribose/relaton-data-ids/blob/d7fd11beadea199a170cdccde011d93cf4fee1e9/data/DRAFT-3GPP-COLLABORATION-01.yaml#L82-L85
==>