Closed artshade closed 1 month ago
Hello :wave:
Thank you for taking the time to open this issue with floccus. I know it's frustrating when software causes problems. You have made the right choice to come here and open an issue to make sure your problem gets looked at and if possible solved. I'm Marcel and I created floccus and have been maintaining it ever since. I currently work for Nextcloud which leaves me with less time for side projects like this one than I used to have. I still try to answer all issues and if possible fix all bugs here, but it sometimes takes a while until I get to it. Until then, please be patient. Note also that GitHub is a place where people meet to make software better together. Nobody here is under any obligation to help you, solve your problems or deliver on any expectations or demands you may have, but if enough people come together we can collaborate to make this software better. For everyone. Thus, if you can, you could also have a look at other issues to see whether you can help other people with your knowledge and experience. If you have coding experience it would also be awesome if you could step up to dive into the code and try to fix the odd bug yourself. Everyone will be thankful for extra helping hands! One last word: If you feel, at any point, like you need to vent, this is not the place for it; you can go to the forum, to twitter or somewhere else. But this is a technical issue tracker, so please make sure to focus on the tech and keep your opinions to yourself.
I look forward to working with you on this issue Cheers :blue_heart:
It seems like parsing and both DOM processing is based on custom text manipulation instead of actual DOM processing (e.g. document.createElement
) considering:
Since it should be supported by other utilities, encoding data like bookmark titles with Base64 is probably not an option. Yet, have you considering an actual DOM processing with more safe features, including textContent?:
const dt = document.createElement('dt');
const a = document.createElement('a');
a.textContent = '123<script>456';
dt.appendChild(a);
console.log(dt);
// <dt><a>123<script>456</a></dt>
Hey @artshade The thing is that we cannot make use of DOM methods, because floccus sync happens in a service worker...
Hey @artshade The thing is that we cannot make use of DOM methods, because floccus sync happens in a service worker...
Just to clarify, have you tried alternatives like jsdom
?
Please check out a quick Node experiment at StackBlitz, which currently results in:
{
nodeVersion: 'v18.20.3',
bookmarks: [
{ href: 'https://example.com', title: '12"3"<script>456' },
{ href: 'https://example.com/test', title: '456<script>"7"89' }
],
serializedBookmarks: '<html><head></head><body><dt><a href="https://example.com">12"3"<script>456</a></dt><dt><a href="https://example.com/test">456<script>"7"89</a></dt></body></html>',
unserializedBookmarks: [
{ href: 'https://example.com/', title: '12"3"<script>456' },
{ href: 'https://example.com/test', title: '456<script>"7"89' }
]
}
if (!('serviceWorker' in navigator)) {
throw new Error('Service worker is not supported.');
}
await navigator.serviceWorker.register('serviceWorker.js');
navigator.serviceWorker.addEventListener('message', e => {
if (e?.data?.type === 'unserialized') {
return;
}
const unserializedBookmarks = e.data.data;
console.log({
unserializedBookmarks
});
});
navigator.serviceWorker.addEventListener('message', e => {
if (e?.data?.type === 'serialized') {
return;
}
const serializedBookmarks = e.data.data;
console.log({
serializedBookmarks
});
navigator.serviceWorker.controller.postMessage({
type: 'unserialize',
data: serializedBookmarks
});
});
// ----------------------------------------------------------------
const bookmarks = [
{
href: 'https://example.com',
title: '12"3"<script>456',
},
{
href: 'https://example.com/test',
title: '456<script>"7"89',
},
];
console.log({
bookmarks
});
navigator.serviceWorker.controller.postMessage({
type: 'serialize',
data: bookmarks
});
// serviceWorker.js
import { JSDOM } from 'jsdom';
function serialize(bookmarks) {
const dom = new JSDOM('');
bookmarks.reduce((r, c) => {
const dt = r.document.createElement('dt');
const a = r.document.createElement('a');
a.setAttribute('href', c.href);
a.textContent = c.title;
dt.appendChild(a);
r.document.body.appendChild(dt);
return r;
}, dom.window);
return dom.serialize();
}
function unserialize(serialized) {
const domBody = new JSDOM(serialized).window.document.body;
return Array.from(domBody.querySelectorAll('dt')).reduce((r, dt) => {
const a = dt.querySelector('a');
return [
...r,
{
href: a.href,
title: a.textContent,
},
];
}, []);
}
self.addEventListener('message', e => {
if (e?.data?.type !== 'serialize') {
return null;
}
self.clients.matchAll().then(clients => {
clients.forEach(client => {
client.postMessage({
type: 'serialized',
data: serialize(e.data.data)
});
});
});
});
self.addEventListener('message', e => {
if (e?.data?.type !== 'unserialize') {
return null;
}
self.clients.matchAll().then(clients => {
clients.forEach(client => {
client.postMessage({
type: 'unserialized',
data: unserialize(e.data.data)
});
});
});
});
I've been actively trying to avoid JSDOM because it's huge. Additionally, the bookmarks html spec does not accept just any html with the right elements, but a very specific subset of HTML that is best hand-crafted, IMO. You can try generating a html file with document.createElement and importing it into a browser: Firefox at least will only take over the folder hierarchy if it has a certain format.
I'm not using regular expressions to parse HTML. I'm using a parser.
I'm not using regular expressions to parse HTML. I'm using a parser.
s/parser/"parser"/
I'm not inclined to continue this conversation unless you are committing to constructive communication. Please refrain from making snark remarks. If you have something to say, say it outright.
Concretely, referring back to your example:
<html><head></head><body><dt><a href="https://example.com/">12"3"<script>456</a></dt><dt><a href="https://example.com/test">456<script>"7"89</a></dt></body></html>
This may be valid html, but it's not a valid bookmarks file. Firefox for example, will fall back to just importing the links without any hierarchy.
I'm not inclined to continue this conversation unless you are committing to constructive communication. Please refrain from making snark remarks. If you have something to say, say it outright.
Considering both the initial bot message and this response, you seem to be more focused on flattery and conversations rather than anything constructive indeed.
No one made any "snark remarks" but higlighted possible nonsense in ego-less manner.
I am sorry if I did tell anything wrong also investing time into the experiment, source code research of this project, and some additional researches. Sure, I won't respond you anymore, since I am not sure you receive the communication fairly enough. Please stay safe, @marcelklehr , and hopefully you will find peace in communicating with people around the globe and consider their invested time, too, indeed.
We all are developers, I believe. Developers try to improve the world as much as possible however they can, don't they?
Then let's be constructive: Why is floccus currently using a "parser" instead of a parser to parse the bookmarks format?
Which version of floccus are you using?
5.2.6
How many bookmarks do you have, roughly?
20k
Are you using other means to sync bookmarks in parallel to floccus?
No
Sync method
WebDAV
Which browser are you using? In case you are using the phone App, specify the Android or iOS version and device please.
Google Chrome, 127.0.6533.120 (Official Build) (64-bit)
Which version of Nextcloud Bookmarks are you using? (if relevant)
No response
Which version of Nextcloud? (if relevant)
No response
What kind of WebDAV server are you using? (if relevant)
RClone WebDav (built-in server).
Describe the Bug
Sincere appreciation for the project! Thank you! ✨
When tried syncing bookmarks with the following bookmark added, the whole set of bookmarks got missing in clients after manual import. A bookmark example with
<script>
tag:Title:
HTML - <script> Tag - Tutorialspoint
URL:https://www.tutorialspoint.com/html/html_script_tag.htm
Expected Behavior
Safe parsing of bookmarks data.
To Reproduce
Initial state of bookmarks
Push
The bookmarks file on WebDav remote:
Pull attempt
Pull ("fail-safe" disabled)
Push (2 time)
The bookmarks file on WebDav remote:
Pull (2 time; "fail-safe" disabled)
Imported the version from remote
Debug log provided