Field values don't account for markdown line-wrapping rules

iamrecursion commented 2 years ago

Describe the bug: When a field's value wraps over multiple lines (according to the markdown rules about line breaks), the continuation lines that join it in the rendered view are not kept as part of the field.

To Reproduce:

Create a note with the following contents (please note the explicit trailing whitespace at the end of the **Test**:: line):

**Test**:: Value  
**Type**:: Foo,
bar

Query these properties, and view the value for the Type property. It will only contain "-Foo".
Open preview view on that note and enable "Strict line breaks" in the Obsidian Editor preferences. You will see that it displays as a coherent line.

Expected behavior: By the markdown spec (enabled by "Strict Line Breaks" in Obsidian), the line **Type**:: ... should display as "Type:: Foo, bar". I would expect querying the corresponding Dataview field to respect the same convention; otherwise the data isn't fully available when querying.
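For illustration, the reported behaviour can be reproduced with a purely line-based reader. This is a minimal sketch, not Dataview's actual parser: the `FIELD_LINE` regex and the `parseLineBased` helper are hypothetical names invented here, and the regex only covers plain and bold `Key:: value` lines.

```typescript
// Hypothetical line-based reader (an assumption, not Dataview's real code).
// Matches `Key:: value` and `**Key**:: value` at the start of a line.
const FIELD_LINE = /^\**([^*:]+)\**::\s*(.*)$/;

function parseLineBased(source: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of source.split("\n")) {
    const match = FIELD_LINE.exec(line);
    if (match) {
      const [, key = "", value = ""] = match;
      fields[key.trim()] = value.trim();
    }
    // Lines without `::` are ignored, so wrapped continuations are lost.
  }
  return fields;
}

// The note from the reproduction steps above.
const note = "**Test**:: Value  \n**Type**:: Foo,\nbar";
const lineBasedFields = parseLineBased(note);
console.log(lineBasedFields["Type"]); // the trailing "bar" line is not included
```

Because each line is inspected in isolation, the `bar` line never reaches the `Type` field, which matches the symptom described in this report.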

Desktop:

OS: macOS 12.0.1
Obsidian Version: 0.12.19
Dataview Version: 0.4.20

Smartphone:

Device: iPhone 13 Pro Max
OS: iOS 15.1
Obsidian Version: 1.0.5
Dataview Version: 0.4.20

iamrecursion commented 2 years ago

Similarly, [] fields do not account for wrapping either, though I would not necessarily expect them to (as they are like any other structured markdown construct). However, common autoformatters for markdown (e.g. prettier) don't understand this construct, so it is likely to get wrapped. Hence, it makes sense to me that wrapping should work there as well.

blacksmithgu commented 2 years ago

Properly parsing this requires a full markdown parser with some special edge case handling, to handle cases like

Field:: Value, 
Field2:: Value

or

Field:: Value.
- Am I in the field?
Field2::
- Now am I a list?

Improved behavior is coming, though it's a big step up from the simple line-based parsing I've been doing thus far.
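The two edge cases above can be sketched with a small continuation-aware reader. This is only an illustration under stated assumptions, not the planned implementation: `FIELD_START`, `BLOCK_START`, and `parseWithContinuation` are hypothetical names, and the rule used here (a field ends at a blank line, at the next field, or at a line that starts a list, heading, or blockquote) is a simplification of how markdown paragraphs are interrupted.

```typescript
// Hypothetical sketch: join soft-wrapped lines into the preceding field value.
const FIELD_START = /^\**([^*:]+)\**::\s*(.*)$/;
// Lines that start a new block and therefore end the current paragraph
// (simplified: bullet and ordered list items, ATX headings, blockquotes).
const BLOCK_START = /^\s*(?:[-*+]\s|\d+[.)]\s|#{1,6}\s|>)/;

function parseWithContinuation(source: string): Record<string, string> {
  const fields: Record<string, string> = {};
  let current: string | null = null; // key of the field still accepting lines
  for (const line of source.split("\n")) {
    const match = FIELD_START.exec(line);
    if (match) {
      // A new field always closes the previous one.
      const [, key = "", value = ""] = match;
      current = key.trim();
      fields[current] = value.trim();
    } else if (current !== null && line.trim() !== "" && !BLOCK_START.test(line)) {
      // Plain text directly below a field is treated as a wrapped continuation.
      fields[current] = `${fields[current] ?? ""} ${line.trim()}`.trim();
    } else {
      // Blank line or new block: stop extending the current field.
      current = null;
    }
  }
  return fields;
}
```

With this rule, `Field:: Value, ` followed by `Field2:: Value` yields two separate fields, and the list item under `Field:: Value.` is left out of the field. A continuation line that itself contains `::` would still be misread as a new field, which is one of the reasons a real markdown parser is the more robust route.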

iamrecursion commented 2 years ago

Definitely. Could you not depend on parsing the rendered source instead? That way your check becomes far simpler as you're just looking for a line beginning with a certain style of token. That, and you're consistent with how Obsidian deals with markdown. Am I missing something that makes that approach a non-starter?

That aside, you potentially could go with something like remark and add a specific parser plugin for your field syntax. I've done that before when working with prettier to add support for not breaking [[wiki-link]] syntax and while it's not trivial it also means much of the work in parsing has been done for you. Maybe this is helpful in that regard.

blacksmithgu commented 2 years ago

Dataview does markdown field parsing before the render stage, so I don't have access to rendered Markdown objects; also, parsing through HTML is a nightmare of its own sort since it becomes way harder to parse things like bolds (**a** becomes <b>a</b>).

Using an actual markdown parser is a good idea - I'll need to investigate if I can replicate existing semantics using a parser.

iamrecursion commented 2 years ago

The only issue with using an actual markdown parser is that some of them don't handle Obsidian's default behaviour (no strict line breaks) out of the box, so you may also be on the hook for custom logic with regards to that. Something to keep in mind!
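To make the difference concrete, here is a minimal sketch of the extra logic that might be needed. It assumes that with "Strict line breaks" off every single newline renders as a visible break, and that with it on only a CommonMark hard break (two or more trailing spaces, or a trailing backslash) does. The helpers `endsWithLineBreak` and `joinWrappedLines` are hypothetical and not part of any parser's API.

```typescript
// Hypothetical helper: is the newline after `line` rendered as a visible break?
function endsWithLineBreak(line: string, strictLineBreaks: boolean): boolean {
  if (!strictLineBreaks) return true; // assumed Obsidian default: every newline breaks
  // CommonMark hard break: two or more trailing spaces, or a trailing backslash.
  return / {2,}$/.test(line) || /\\$/.test(line);
}

// Joins the raw lines of one value: soft breaks become a space,
// visible breaks are kept as "\n".
function joinWrappedLines(lines: string[], strictLineBreaks: boolean): string {
  let out = "";
  lines.forEach((line, i) => {
    // Drop the hard-break marker itself before appending the text.
    const text = line.replace(/(?: {2,}|\\)$/, "").trim();
    if (i === 0) {
      out = text;
      return;
    }
    const previous = lines[i - 1] ?? "";
    out += (endsWithLineBreak(previous, strictLineBreaks) ? "\n" : " ") + text;
  });
  return out;
}
```

Under these assumptions the `Foo,` / `bar` value from the original report would join into one line only when strict line breaks are enabled, so a parser-based approach would probably need to take this setting as an input.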

iamrecursion commented 2 years ago

This issue is bugging me more and more recently! Is it something you'd welcome external contributions on, or is it so core to Dataview that you'd rather handle it yourself?

blacksmithgu commented 2 years ago

I'm fine with external contribution, though I would wait a few days for 0.5.0 to come out since it uses a full Markdown parser as opposed to the jank regex stuff I do now, and it supports some line break functionality like so:

**Field**::
- Value 1
- Value 2
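As a rough illustration of what that could look like to a consumer, the sketch below collects list items that directly follow a value-less field into an array. It is an assumption about the shape of the result, not the 0.5.0 implementation; `FIELD_HEADER`, `LIST_ITEM`, and `parseListFields` are hypothetical names.

```typescript
// Hypothetical sketch: a field with no inline value takes the list below it.
const FIELD_HEADER = /^\**([^*:]+)\**::\s*(.*)$/;
const LIST_ITEM = /^\s*[-*+]\s+(.*)$/;

type FieldValue = string | string[];

function parseListFields(source: string): Record<string, FieldValue> {
  const fields: Record<string, FieldValue> = {};
  let open: string[] | null = null; // list currently being filled, if any
  for (const line of source.split("\n")) {
    const field = FIELD_HEADER.exec(line);
    const item = LIST_ITEM.exec(line);
    if (field) {
      const [, key = "", value = ""] = field;
      if (value.trim() === "") {
        // No inline value: start collecting list items for this key.
        // (If no items follow, the value stays an empty array in this sketch.)
        open = [];
        fields[key.trim()] = open;
      } else {
        fields[key.trim()] = value.trim();
        open = null;
      }
    } else if (item && open !== null) {
      const [, text = ""] = item;
      open.push(text.trim());
    } else {
      open = null; // anything else ends the list
    }
  }
  return fields;
}
```

For the example above, this sketch would return `Field` as the two-element list `["Value 1", "Value 2"]`.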

iamrecursion commented 2 years ago

Oh, wonderful! I'll wait on that first and then see if my use-cases still need some work!

iamrecursion commented 2 years ago

> I'm fine with external contribution, though I would wait a few days for 0.5.0 to come out since it uses a full Markdown parser as opposed to the jank regex stuff I do now, and it supports some line break functionality like so:
>
> **Field**::
> - Value 1
> - Value 2

Is this functionality currently meant to work on HEAD? I've just tried it and it doesn't seem to be working for me.

blacksmithgu / obsidian-dataview

Field values don't account for markdown line-wrapping rules #622