TuftsBCB / medford

Human-readable metadata file format that consolidates research information so that it can be stored, updated, and submitted to databases without a large time investment.
MIT License

Trailing Whitespaces Should be Removed #9

Closed infispiel closed 2 years ago

infispiel commented 3 years ago

This came up in issue #6, but I'm documenting it separately because it isn't critical for inline comment functionality. When an inline comment is used as follows:

@major-minor body # comment

The data will be stored as follows:

[
    'major': [ 
        'minor' : ['desc': 'body ']
    ]
]

Technically, the trailing whitespace after body should be removed. The most straightforward approach -- deleting any trailing whitespace when the detail is first created -- isn't necessarily the correct one. Consider the following case:

@major-minor body # comment
more body here

If later tools were to export the body of a MEDFORD file into notes, we would want to keep the trailing space so that we get the text "body more body here" instead of "bodymore body here". (I am affectionately dubbing this the 'pdf copy problem'.)
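To make the 'pdf copy problem' concrete, here is a minimal sketch in plain Python. It is not MEDFORD code: join_lines is a hypothetical helper, and it assumes the inline comment has already been removed from the first line.

```python
def join_lines(lines, strip_each):
    """Concatenate the lines of one detail, optionally stripping
    trailing whitespace from each line before joining (hypothetical helper)."""
    parts = [ln.rstrip() if strip_each else ln for ln in lines]
    return "".join(parts)


# The two lines of the detail, after the inline comment has been removed.
lines = ["body ", "more body here"]

print(join_lines(lines, strip_each=True))   # bodymore body here
print(join_lines(lines, strip_each=False))  # body more body here
```

Stripping eagerly on every line loses the only separator between the two fragments, which is why the trailing space cannot simply be dropped at parse time.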

There should probably be some logic that goes through completed details and removes all trailing whitespace, because while the line with the undesired space is being parsed, we don't yet know whether the detail is complete. It is unclear whether this should live in detailparser or in the output step, after the detail has been turned into a pydantic model.
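One possible shape for such a pass is sketched below, assuming the completed details are available as nested dicts and lists of strings. The function name strip_completed and the exact nesting are assumptions for illustration; the real structure in detailparser or the pydantic model may differ.

```python
def strip_completed(details):
    """Return a copy of a nested dict/list structure with trailing
    whitespace removed from every string value (hypothetical helper)."""
    if isinstance(details, str):
        return details.rstrip()
    if isinstance(details, dict):
        return {key: strip_completed(value) for key, value in details.items()}
    if isinstance(details, list):
        return [strip_completed(value) for value in details]
    # Leave non-string leaves (numbers, None, ...) untouched.
    return details


# Assumed shape of a parsed file; only 'desc' carries a trailing space.
parsed = {"major": {"minor": {"desc": "body "}}}
cleaned = strip_completed(parsed)
print(cleaned)  # {'major': {'minor': {'desc': 'body'}}}
```

Because it only runs once every detail is complete, this pass never has to guess whether another line will follow.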

infispiel commented 3 years ago

I realized that this is already partially implemented for lines without comments, as follows:

line = line.strip()

When more data is added to a detail, addData is called, which consists entirely of:

self.Data = self.Data + " " + line

So... for now I'll adjust the logic to do this for lines with comments too, and for multiline macros. One more step that could resolve this entirely would be to check whether the data being added begins with whitespace, as that is the only case I can imagine where the user wouldn't want a space added...
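A rough sketch of what that adjusted logic might look like follows. The Detail class here is a simplified stand-in, not the actual MEDFORD class, and remove_comment is a hypothetical helper that assumes any '#' starts a comment (no escaping).

```python
def remove_comment(line):
    """Drop an inline '# comment' and any trailing whitespace.
    Assumption: '#' always starts a comment; escapes are not handled."""
    return line.split("#", 1)[0].rstrip()


class Detail:
    """Simplified stand-in for the parser's detail object."""

    def __init__(self, line):
        # First line: strip both ends, with or without a comment.
        self.Data = remove_comment(line).strip()

    def addData(self, line):
        text = remove_comment(line)
        if not text:
            return  # blank or comment-only line: nothing to add
        if text[0].isspace():
            # The user supplied their own leading whitespace; keep it as-is.
            self.Data = self.Data + text
        else:
            self.Data = self.Data + " " + text


detail = Detail("body # comment")
detail.addData("more body here")
print(repr(detail.Data))  # 'body more body here'
```

Trailing whitespace is removed on every line, and the single joining space is only added when the continuation line does not already start with whitespace.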