Redgram / redgram-for-reddit

An Open-Sourced Android Reddit Client
GNU General Public License v3.0
102 stars, 19 forks

App - Parsing text using Markdown #43

Open mhdatie opened 7 years ago

mhdatie commented 7 years ago

https://github.com/Redgram/redgram-for-reddit/tree/dev/Redgram/app/src/main/java/com/matie/redgram/ui/common/utils/text

mhdatie commented 7 years ago

@TeeKoo

mhdatie commented 7 years ago

Resources

https://www.reddit.com/r/reddit.com/comments/6ewgt/reddit_markdown_primer_or_how_do_you_do_all_that/c03nik6/

https://www.reddit.com/r/regex/comments/5kz24n/need_some_assistance_in_mimicking_markdown_with/dbrqjl5/

mhdatie commented 7 years ago

StringDecorator class is used to build a Spannable string by appending pieces of text and specifying the Span of each append operation. It doesn't parse a full text as it acts like a builder.

We can add a method to the builder that parses markdown only, then call this method in a custom TextView. The builder has a reference to the current class and sets the Spannable text.

The changes would include using this custom view instead of TextView in the XML layouts and calling the custom method, since setText(CharSequence text) is declared final and cannot be overridden.

A second way is to use StringDecorator builder and use the existing TextView.

A third way is to keep StringDecorator untouched and use the custom TextView to parse any markdown text.

Note: We will also need the parser for the custom EditText, on text change, for link and comment submission, so the same parser needs to be available in a common place.

mhdatie commented 7 years ago

I have decided to use the Builder pattern, because I want to make sure we apply only the spans we need rather than rely on a default implementation that could be overridden. If in the future we want to add our own custom spans, or let the user choose which spans they want or what kind of action to perform, we need control over what is shown and over which span is applied to the whole text based on a regex pattern.

So there are default methods that all use the same final parse(...) method. We pass the text and the view to the builder and, after applying the spans one by one, call build() to set the spannable text on the view.

All parsing methods accept a common parameter, Object... spans. The reason is that, when finding matches with a regular expression, we want to span the markdown AND replace the markdown with the actual text while keeping the spans applied. Accepting multiple spans lets us apply a set of spans first and then replace the text with the matcher group from the specified regex.
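To make the shape of this builder concrete, here is a minimal sketch in plain Java. It is not the project's actual code: the class name MarkdownSpanBuilder, the two patterns, and the use of plain strings in place of Android span objects are all assumptions made so the example runs outside Android. It only shows the opt-in steps that delegate to one shared final parse(...) and a build() at the end; it does not replace the markdown with the captured text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical builder; plain strings stand in for Android span objects. */
class MarkdownSpanBuilder {
    // Illustrative patterns only, not a complete Reddit markdown grammar.
    private static final Pattern BOLD = Pattern.compile("\\*\\*(.+?)\\*\\*");
    private static final Pattern STRIKE = Pattern.compile("~~(.+?)~~");

    private final String text;
    private final List<String> applied = new ArrayList<>();

    MarkdownSpanBuilder(String text) {
        this.text = text;
    }

    // Each public step opts in to one kind of markdown.
    MarkdownSpanBuilder bold(String... spans) {
        return parse(BOLD, spans);
    }

    MarkdownSpanBuilder strike(String... spans) {
        return parse(STRIKE, spans);
    }

    // The single shared implementation behind every step; final so it cannot be overridden.
    protected final MarkdownSpanBuilder parse(Pattern pattern, String... spans) {
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            for (String span : spans) {
                // Record every requested span over the captured group of this match.
                applied.add(span + "@" + m.start(1) + "-" + m.end(1));
            }
        }
        return this;
    }

    // In the real builder this is where the spannable text would be set on the view.
    List<String> build() {
        return new ArrayList<>(applied);
    }
}
```

A caller would chain only the steps it wants, for example new MarkdownSpanBuilder(text).bold("bold").strike("strike").build(), which is the kind of control over applied spans described above.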

This is ongoing, because we will also need to apply this parsing to the EditText, for example after each keystroke to detect changes in the text, so it needs to be efficient.

Note: not tested, but I commented out a way to use it on a thread (see PostFragment.java).

mhdatie commented 7 years ago

https://www.reddit.com/r/HFY/wiki/ref/faq/formatting_guide

mhdatie commented 7 years ago

Three issues that I'm encountering:

What I can do for the third point is to modify one match only, then run the Matcher again for the next one, and so on, but that is inefficient.

For the first point I have to somehow create new spans for each match, or apply the span for each Markdown operation separately instead of passing Span objects as parameters.

I will tackle the second issue in the end.
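One way to get a new span for each match, sketched here under assumptions, is to pass factories instead of span instances and call each factory once per match. The names SpanPerMatch, Placed and place are hypothetical, and Object stands in for an Android span type; on Android a factory might look like () -> new StyleSpan(Typeface.BOLD).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpanPerMatch {
    /** Hypothetical pairing of one span instance with the range it covers. */
    static final class Placed {
        final Object span;
        final int start;
        final int end;

        Placed(Object span, int start, int end) {
            this.span = span;
            this.start = start;
            this.end = end;
        }
    }

    /** Calls every factory once per match, so no span instance is shared between ranges. */
    static List<Placed> place(String text, Pattern pattern, List<Supplier<Object>> factories) {
        List<Placed> placed = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            for (Supplier<Object> factory : factories) {
                placed.add(new Placed(factory.get(), m.start(), m.end()));
            }
        }
        return placed;
    }
}
```

The trade-off is that callers hand over a recipe for a span rather than a configured object, which keeps the parse method generic while still producing a distinct instance per range.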

mhdatie commented 7 years ago

The latest commit solves the first and second scenarios.

I still need to figure out how to display only the captured groups on the fly. Taking substrings and replacing each match with its captured groups is not workable, as it alters the string, so maybe the best way is to build a final Spannable from left to right?

mhdatie commented 7 years ago

All texts in links display https://www.reddit.com/r/regex/comments/5kz24n/need_some_assistance_in_mimicking_markdown_with/dbrqjl5/

I also commented out the Cache interceptor in order to avoid HTTP errors (#18).

mhdatie commented 7 years ago

I believe we will need to capture the data in groups and store them somehow and build a new spannable string with only the data. Right now the span is applied on the whole Markdown captured by regular expressions.
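A minimal sketch of that idea, assuming each pattern exposes the visible text as group 1: walk the matches from left to right, copy the untouched text and the captured group into a new buffer, and record each span range against the new buffer rather than the original string. LeftToRightStripper, Range and the BOLD pattern are hypothetical names for illustration; on Android the recorded ranges would be applied to a SpannableStringBuilder instead of being kept in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LeftToRightStripper {
    /** Hypothetical stand-in for a span: a style name over [start, end) of the output text. */
    static final class Range {
        final String style;
        final int start;
        final int end;

        Range(String style, int start, int end) {
            this.style = style;
            this.start = start;
            this.end = end;
        }
    }

    // Illustrative pattern only: **bold** with the visible text in group 1.
    static final Pattern BOLD = Pattern.compile("\\*\\*([^*]+)\\*\\*");

    final String text;        // display text with the markdown delimiters removed
    final List<Range> ranges; // offsets refer to `text`, not to the original source

    private LeftToRightStripper(String text, List<Range> ranges) {
        this.text = text;
        this.ranges = ranges;
    }

    static LeftToRightStripper strip(String source, Pattern pattern, String style) {
        StringBuilder out = new StringBuilder();
        List<Range> ranges = new ArrayList<>();
        Matcher m = pattern.matcher(source);
        int cursor = 0; // first source index that has not been copied yet
        while (m.find()) {
            out.append(source, cursor, m.start()); // plain text before the match
            int start = out.length();
            out.append(m.group(1));                // only the captured data, no delimiters
            ranges.add(new Range(style, start, out.length()));
            cursor = m.end();
        }
        out.append(source, cursor, source.length()); // trailing plain text
        return new LeftToRightStripper(out.toString(), ranges);
    }
}
```

Because the source string is never modified, the Matcher runs once and the offsets of later matches stay valid.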

mhdatie commented 7 years ago

Latest Resources on MD

https://www.reddit.com/r/reddit.com/comments/6ewgt/reddit_markdown_primer_or_how_do_you_do_all_that/

https://www.reddit.com/r/changelog/comments/mg1j6/reddit_change_new_markdown_interpreter/

mhdatie commented 7 years ago

I'm compiling a list of regex patterns (with examples) for the following:

IMPORTANT:

Needs optimizing:

Some regex engines cannot capture multiple groups for nested markdown; they end up capturing only the last group. So I believe we will need to match the first level, apply the MD, replace the text, check for any nested level, and then repeat.
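The repeat-until-nothing-matches loop could look roughly like the sketch below. It is an illustration under assumptions, not the planned implementation: the class NestedMarkdownSketch, the three rules and the Span record are hypothetical, and each pattern is assumed to put the visible text in group 1. Each pass strips one match, shifts the spans recorded so far to account for the removed delimiters, and then searches again so nested markdown is picked up on a later pass.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NestedMarkdownSketch {
    /** Hypothetical span record: a style name over [start, end) of the current text. */
    static final class Span {
        final String style;
        int start;
        int end;

        Span(String style, int start, int end) {
            this.style = style;
            this.start = start;
            this.end = end;
        }
    }

    // Illustrative rules only; insertion order matters so "**" is tried before "*".
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("bold", Pattern.compile("\\*\\*(.+?)\\*\\*"));
        RULES.put("italic", Pattern.compile("\\*(.+?)\\*"));
        RULES.put("strike", Pattern.compile("~~(.+?)~~"));
    }

    final StringBuilder text;
    final List<Span> spans = new ArrayList<>();

    private NestedMarkdownSketch(String source) {
        this.text = new StringBuilder(source);
    }

    /** Maps an old offset to its new value once the delimiters around [gs, ge) are removed. */
    private static int shift(int p, int s, int gs, int ge, int e) {
        int prefix = gs - s; // length of the opening delimiter
        int suffix = e - ge; // length of the closing delimiter
        if (p >= e) return p - prefix - suffix;
        if (p >= ge) return ge - prefix;
        if (p >= gs) return p - prefix;
        return Math.min(p, s);
    }

    /** Strips one match of the first rule that matches; returns false when none is left. */
    private boolean stripOne() {
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(text);
            if (!m.find()) continue;
            int s = m.start(), gs = m.start(1), ge = m.end(1), e = m.end();
            String inner = m.group(1); // read before the buffer is modified
            for (Span span : spans) {
                span.start = shift(span.start, s, gs, ge, e);
                span.end = shift(span.end, s, gs, ge, e);
            }
            text.replace(s, e, inner);
            spans.add(new Span(rule.getKey(), s, s + inner.length()));
            return true;
        }
        return false;
    }

    static NestedMarkdownSketch parse(String source) {
        NestedMarkdownSketch result = new NestedMarkdownSketch(source);
        while (result.stripOne()) {
            // Keep stripping until no rule matches, which also covers nested levels.
        }
        return result;
    }
}
```

This restarts the search after every replacement, so it trades efficiency for simplicity; it is meant to show the level-by-level idea rather than an optimized parser.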

Implementation will continue as soon as the full list is finalized.

mhdatie commented 7 years ago

I needed a way to determine the line number based on the most updated text.

I had to introduce a custom SpannableStringBuilder, MDSpannableStringBuilder, that triggers a listener on each span addition. In the case of SuperScriptSpans I had to trigger the listener only when the last span is added, so I added a trigger flag so that the listener isn't invoked on the fly every time.

After setting the new text to the TextView, view.getLayout() is usually null as it's not instant. To solve this, I added:

ViewTreeObserver vto = view.getViewTreeObserver(); vto.addOnGlobalLayoutListener(() -> {...

This global listener operates on the TextView in context only. However, since there can be multiple superscripts that need updating and the global layout listener is NOT synchronous, I had to add the spans I receive to a Queue and later process each span in FIFO order.

The global listener is only triggered when a view change is made (e.g. setText()), so for every affected span I had to update the text, which triggers the listener again, which then operates on the next span residing in the Queue.
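The control flow can be sketched without Android, as a simulation only. Here a boolean flag stands in for "setText() was called, so another layout pass will follow", strings stand in for the spans, and all names (SpanQueueSketch, onGlobalLayout, run) are hypothetical. Each simulated layout pass handles exactly one queued span and, by updating the text, asks for the next pass.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class SpanQueueSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean layoutRequested;

    final List<String> handled = new ArrayList<>();
    int layoutPasses;

    /** Called whenever the builder reports a span that needs a layout-dependent update. */
    void enqueue(String span) {
        pending.add(span);
    }

    /** Stand-in for the global layout callback: processes one span per pass, in FIFO order. */
    void onGlobalLayout() {
        layoutPasses++;
        String span = pending.poll();
        if (span == null) {
            return; // queue drained: no text update, so no further pass
        }
        handled.add(span);      // the line-dependent update of this span would happen here
        layoutRequested = true; // updating the text schedules another layout pass
    }

    /** Simulates the initial setText() and the layout passes that follow it. */
    void run() {
        layoutRequested = true;
        while (layoutRequested) {
            layoutRequested = false;
            onGlobalLayout();
        }
    }
}
```

With three queued spans this simulation performs four passes: one per span, plus a final pass that finds the queue empty and stops the chain.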