suchnsuch / Tangent

The publicly-available modules of the Tangent project.

Copy / Pasting from external sources can sometimes include too many new lines or not enough #2

Open taylorhadden opened 1 year ago

taylorhadden commented 1 year ago

The following is from a Twitter DM:

It's something to do with line breaks, CR LFs or something. One time I pasted in text that in the source I copied from was on multiple lines, but when I pasted it into tangents it all pasted on one line. E.g.:

"This was
On
Multiple lines"

became:

"This wasOnMultiple lines"

When I copied it back out of tangents I believe it pasted back onto multiple lines so the new line characters were still there but not rendering.

Confusingly I also have once copy pasted in and had text get extra blank lines between each line:

"Text that looked
Like this"

became:

"Text that looked

Like this"

That was from a rich text source.
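If the first problem really is about line endings, as the DM guesses, one possible cause is text that uses bare carriage returns (CR) or CRLF pairs while the editor only treats LF as a line break. A minimal sketch of normalizing pasted text before inserting it; `normalizeLineEndings` is a hypothetical helper, not something known to exist in Tangent:

```typescript
// Hypothetical helper (an assumption, not Tangent's actual code).
// Converts CRLF pairs and bare CR characters to LF so that every
// line break in pasted text uses the same character.
function normalizeLineEndings(text: string): string {
  // "\r\n?" matches a CR optionally followed by an LF, so both
  // Windows-style (CRLF) and bare-CR endings collapse to "\n".
  return text.replace(/\r\n?/g, "\n");
}

const pasted = "This was\rOn\r\nMultiple lines";
console.log(JSON.stringify(normalizeLineEndings(pasted)));
```

This would also fit the observation that copying the text back out restored the line breaks: the break characters would still be in the document, just not ones the renderer recognizes.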

abstractrealism commented 1 year ago

For the GitHub record, I couldn't remember how to reproduce the first problem. I will update if I can remember. For the second (extra line breaks), here are two examples. The first is from Google Sheets:

<google-sheets-html-origin style="font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">

1
--
2
3
4

</google-sheets-html-origin>

becomes:

1  

2  

3  

4

The expected output would be:

1
2
3
4

And from a Photoshop extension I was debugging:

keyName: V
keyIdentifier: U+0076
keyLocation: 4294967295
ctrlKey: false
altKey: false
shiftKey: false
metaKey: false

became:

keyName: V

keyIdentifier: U+0076

keyLocation: 4294967295

ctrlKey: false

altKey: false

shiftKey: false

metaKey: false

The .md file with these examples is attached, in case copy/pasting into GitHub messed things up: Tangents text examples.md

taylorhadden commented 1 year ago

The latest code in tangent-html-to-markdown produces reasonable output from a copy/paste from Google Sheets. Tangent itself will update soonish.

I'd love to see the raw HTML input pulled from that Photoshop plugin. Tangent v0.3.4-beta.1 and up will log the text/html content of the clipboard when you paste into Tangent.
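For anyone who wants to capture the same clipboard data outside Tangent, here is a rough sketch of reading the text/html flavor during a paste. `ClipboardLike` and `describeClipboard` are names made up for this example; the only real API assumed is the browser's `DataTransfer.getData`, which returns an empty string when a format is not present. This is not how Tangent's own logging is implemented.

```typescript
// Minimal structural type so the function can be exercised without a browser.
// A real ClipboardEvent's clipboardData (a DataTransfer) satisfies it.
interface ClipboardLike {
  getData(format: string): string;
}

// Hypothetical helper: returns the HTML flavor if the clipboard has one,
// otherwise falls back to the plain-text flavor.
function describeClipboard(data: ClipboardLike): string {
  const html = data.getData("text/html");
  const plain = data.getData("text/plain");
  return html !== "" ? `text/html:\n${html}` : `text/plain:\n${plain}`;
}

// In a browser, this could be wired up as:
// document.addEventListener("paste", (event) => {
//   if (event.clipboardData) console.log(describeClipboard(event.clipboardData));
// });
```

Rich text sources usually put both flavors on the clipboard, which is why the HTML version is the one worth inspecting when line breaks come out wrong.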

abstractrealism commented 1 year ago

Here's that raw HTML. The issue with the text formatting code showing up is indeed fixed!


<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Style-Type" content="text/css">
<title></title>
<meta name="Generator" content="Cocoa HTML Writer">
<meta name="CocoaVersion" content="2113.5">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; text-align: center; font: 11.0px 'Helvetica Neue'; color: #000000; color: rgba(0, 0, 0, 0.85)}
</style>
</head>
<body>
<p class="p1">keyName: F</p>
<p class="p1">keyIdentifier: U+0066</p>
<p class="p1">keyLocation: 4294967295</p>
<p class="p1">ctrlKey: false</p>
<p class="p1">altKey: false</p>
<p class="p1">shiftKey: false</p>
<p class="p1">metaKey: false</p>
</body>
</html>

When I look in the Elements pane, an extra paragraph tag containing a break appears between each of those lines:


<p style="--lineIndent: 0;" class=" line">
<br>
</p>
<p style="--lineIndent: 0;" class=" line">
metaKey: false
</p>
<p style="" class=" line">
<br>
</p>
</article>
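One way a converter could avoid the extra blank lines is to treat each consecutive `<p>` from this kind of source as a single line, joined by one newline instead of a blank line. The sketch below assumes flat, non-nested `<p>` elements like the Cocoa HTML Writer output above; `paragraphsToLines` is a hypothetical name, and this is not the approach tangent-html-to-markdown is known to take.

```typescript
// Hypothetical helper (an assumption for illustration only).
// Extracts the text of each <p> element and joins the results with a
// single newline. A regex is enough for flat markup like the example,
// but a real converter would use an HTML parser and decode entities.
function paragraphsToLines(html: string): string {
  const lines: string[] = [];
  const paragraph = /<p\b[^>]*>([\s\S]*?)<\/p>/gi;
  let match: RegExpExecArray | null;
  while ((match = paragraph.exec(html)) !== null) {
    const inner = match[1] ?? "";
    // Drop <br> and any other inline tags, keeping only the text.
    // A paragraph that held only a <br> becomes an empty line, so
    // blank lines that were intended in the source are preserved.
    const text = inner.replace(/<[^>]+>/g, "").trim();
    lines.push(text);
  }
  return lines.join("\n");
}

const sample =
  '<p class="p1">keyName: F</p>\n<p class="p1">keyIdentifier: U+0066</p>';
console.log(paragraphsToLines(sample));
```

Whether single newlines are the right output depends on the source: here the stylesheet sets every paragraph margin to 0.0px, which suggests the paragraphs were meant to look like adjacent lines.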