Ocelot-Social-Community / Ocelot-Social

Free and open-source social network for active citizenship.
https://ocelot.social
Other
92 stars 36 forks source link

🐛 [Bug] Embeds are not displayed #2259

Closed Tirokk closed 3 years ago

Tirokk commented 3 years ago

roschaefer Authored by roschaefer Closed

:bug: Bugreport

Probably after #2242 the previous embeds are not displayed anymore. See:

Screenshot - 2019-11-16T185240 630

Steps to reproduce the behavior

It's difficult to reproduce since you need old production data. But the above post is here: https://human-connection.social/post/fc2dd986-c8fb-4251-a653-63541a844b66/meine-erste-vorlesung-an-der-htw-berlin-hochschule-fur-technik-und-wirtschaft

Expected behavior

The editor should parse the old embeds just fine.

Additional context

https://human-connection.social/post/43fb27ec-fa26-4c42-88ed-179542796358/human-connection-5#commentId-c6440f59-e266-4888-9e96-6831ac249c13

Tirokk commented 3 years ago

roschaefer Authored by roschaefer


@MarcoBomfim :point_up:

Tirokk commented 3 years ago

MarcoBomfim Authored by MarcoBomfim


@roschaefer does this happened to all other embeds that were already published? smells like this might be happening because the old data is still pointing to an 'inline' component, but the current embed component is a block.

I'll take a look on this throughout the day and will update this thread with any findings. Thanks for the heads-up!

Tirokk commented 3 years ago

roschaefer Authored by roschaefer


Here is the difference: Broken embedded link 2019-11-19-140659_1920x1080_scrot

Recently created embedded link, not broken 2019-11-19-140953_1920x1080_scrot

Seems the only difference is that an inline link is inside the <p> tags whereas the new links are between those those paragraphs.

Tirokk commented 3 years ago

ogerly Authored by ogerly


and I see it this way ... Bildschirmfoto vom 2019-11-19 15-06-58

Tirokk commented 3 years ago

ogerly Authored by ogerly


this is the test link : https://peertube.social/videos/watch/9c9de5e8-0a1e-484a-b099-e80766180a6d I can't make the mistake. Bildschirmfoto vom 2019-11-19 15-13-37

Tirokk commented 3 years ago

roschaefer Authored by roschaefer


@ogerly it's not about new data. New embed links are saved just fine. It's about the existing links in the database which are not displayed as embed links.

Tirokk commented 3 years ago

ogerly Authored by ogerly


because the class="embed" is missing ... and were not entered via the editor. so they are cleaned when loading html code. and old links that don't have the class are gone ...

Tirokk commented 3 years ago

roschaefer Authored by roschaefer


Are you absolutely sure @ogerly :thinking: ? We did not change anything about the css class recently: https://github.com/Human-Connection/Human-Connection/blob/master/webapp/components/Editor/nodes/Embed.js#L45

Tirokk commented 3 years ago

MarcoBomfim Authored by MarcoBomfim


@ogerly, @roschaefer, I'm finishing up a sprint here on the company I work on, so I haven't had much time to check out this bug. Can any of you clarify whether we are able to fix the data by any manual way? Or do we need to rollback, find a way to sort out old production data, and then rollout the embed in block?

I'm planning on partaking the meeting on Friday, but just hit me on Discord if anything comes up. I'm 100% willing to help out on this fix, even though I don't have a lot of time to work on it.

Tirokk commented 3 years ago

mattwr18 Authored by mattwr18


hey that's great @MarcoBomfim ... people seem quite stressed about this on the network..

@ogerly and @roschaefer, the class is not there for the contentExcerpt, but is present for content and this is what is important for displaying the embeds, I guess.

@MarcoBomfim I have "fixed" it for one Post, on nitro-staging, @roschaefer Post from the screen shot :point_up_2: Screenshot from 2019-11-20 18-06-37

I did this by analyzing the difference between the content of the Post as it was saved in the database and the content of a new comment with the same link, but the new block element. I made note of the difference between the inline and block elements and then changed the Post content accordingly... here is before

"<p>Hier ist die Videoaufzeichnung auf Peertube: <a href=\"https://tube.tchncs.de/videos/watch/05ed3e62-77b1-4ac7-bd20-793e2f86bcfd\" class=\"embed\" target=\"_blank\"></a>und das gleiche nochmal auf der kommerziellen Plattform <a href=\"https://www.youtube.com/watch?v=kDLg8oEWFYY\" target=\"_blank\">Youtube</a>.</p><p>Meine Vorlesungsfolien findet ihr hier: <a href=\"https://speakerdeck.com/roschaefer/systems-development-and-frameworks-number-0-introduction\" class=\"embed\" target=\"_blank\"></a></p>"

and after

"<p>Hier ist die Videoaufzeichnung auf Peertube: </p> <a href=\"https://tube.tchncs.de/videos/watch/05ed3e62-77b1-4ac7-bd20-793e2f86bcfd\" class=\"embed\" target=\"_blank\"></a><p>und das gleiche nochmal auf der kommerziellen Plattform</p><a href=\"https://www.youtube.com/watch?v=kDLg8oEWFYY\" target=\"_blank\">Youtube</a>.<p>Meine Vorlesungsfolien findet ihr hier:</p> <a href=\"https://speakerdeck.com/roschaefer/systems-development-and-frameworks-number-0-introduction\" class=\"embed\" target=\"_blank\"></a>"

As you can see :point_up_2: this closes all <p> tags before the start of any <a> tags and removes any redundant closing </p> tags.

I went to a Meetup one time where a guy from Gitlab talked about how they wrote their own markdown parser that goes through the text and does something similar to this.

I believe we could write a script that would iterate through every post's content and look for an opening <p> tag that if is followed by an opening <a> tag before it is closed, and if it finds one, inserts a closing </p> before the <a>. It also removes any dangling closing </p> tags. This could then be done in an automated way, but it doesn't sound simple at all. Complicated by the fact that it's not just simple text, but involves making cypher queries, checking for some conditions, and if present updating a User's post.

The easiest thing to do is to edit your post or comment to re-add the link, of course this isn't nice to ask our users to do since we are the ones who accidentally broke the functionality, and should be responsible for cleaning it up.

Still, unless someone on the team or the open-source community has some skills to pull off something like :point_up_2: in short order, it might be good to pin a Post asking any users that have noticed their disappearing links to please re-paste them in while we work on a comprehensive fix.

Or maybe there are other ideas on how to fix it?

Tirokk commented 3 years ago

MarcoBomfim Authored by MarcoBomfim


@mattwr18, thanks! About:

I believe we could write a script that would iterate through every post's content and look for an opening

tag that if is followed by an opening tag before it is closed, and if it finds one, inserts a closing

before the . It also removes any dangling closing

tags.

This is what I meant by a manual approach indeed, and about the note for the users, back when I worked on the support, customers were always more concerned with not having a way to get content back, then to being told to do some steps, there are few cases where the users get actually mad due to having to simply re-paste an URL, so I think it could be a great start while I take a look on the issue.

Is this something that can be rolled back though? Or are users facing this for new content as we speak?

Tirokk commented 3 years ago

mattwr18 Authored by mattwr18


thanks for the quick reply @MarcoBomfim ... I agree with your assessment about the users wanting assurance that the link still exists and how they can get it to display properly. I thought about rolling it back, we even agreed it might be a good fast step, but

1) that would mean all links adding since then would be suddenly broken, and
2) it didn't work when I tried it out locally, so I'm not confident it would even work in the first place, unfortunately.
Tirokk commented 3 years ago

mattwr18 Authored by mattwr18


@roschaefer @MarcoBomfim ... I made some little progress on this today, maybe (?) having a look at some production data, I noticed that it seems that every embed that was added since the new alpha has been up has a class=\"embed\" in it's content. Old embeds have data-url-embed.

I started to write a script that would isolate the Posts that have embed content since the new alpha has been up, and then return the content, convert to block elements, then update the Post... maybe there is a better solution, but I haven't see any...

The cypher statement looks like this

MATCH (post:Post) 
WHERE post.content contains 'class=\"embed\"' 
RETURN post.content

this is a count of all posts in production that meet the match above

match (post:Post) 
           where post.content contains 'class=\"embed\"' 
           return count(post);
+-------------+
| count(post) |
+-------------+
| 3038        |
+-------------+

I played around with using Nokogiri some docs here https://nokogiri.org/tutorials/modifying_an_html_xml_document.html

And was able to do some interesting things like wrap the anchor tags in <p> tags (don't think this would solve our problem though), I was able to select <p> and <a> inside <p> tags and move them to be the next sibling using add_next_sibling(), but this also doesn't solve the issue cause the Post :point_up_2:, it moves the first anchor tag outside the first paragraph and therefore breaks the flow of the paragraph. what it should do is add a closing </p> before the opening <a> and an opening one after the </a>....

I also looked at how to get the text of a <p> tag up till another tag, like <a> and then change the text to append a </p>, but I haven't gotten there... also this all seems a bit dangerous as there will almost certainly be cases I don't think of, and we don't test properly with our develop data, or nitro-staging data, and it's scary talking about changing user data like so.