Open mbbroberg opened 8 months ago
I was going through the index.html Notion spits out and noticed that they have a syntax error if a note has "++ " in the title.
This leaves a broken list item that leads to a bunch of junky text in that file. I went a little deeper and found another file that parsed that way (it was a regex cheatsheet 🤦). This looks like it's a bug on Notion's export first and foremost. That said, the importer has an opportunity to sanitize its inputs.
Hope that's helpful.
I've been following the Notion import process, using HTML as the output, and I'm now combing through my results. I have hundreds of files that look something like this:
As the only file contents. It seems to catch on Python, Git, Ruby, Go, and other command-line scripting notes like my "Make ZSH default shell.md" file. I found them all pretty quickly once I saw the pattern and used find:
And so on -- there were 300+ out of my 5k files. Swapping -print for -delete took care of it. Seems like a bug in the parser and I wanted to share.
UPDATE -- I missed quite a few misparsed files. The pattern seems to be that content got pushed into the YAML front matter. The filename isn't an indicator, so I'm writing a script to find them all.
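For anyone hitting the same thing, here is a minimal sketch of what such a script could look like. It is not the script mentioned above and not part of the importer. It assumes the imported notes are .md files that open with a `---` delimited front matter block, and that a healthy block contains only key/value lines, list items, indented continuations, or blank lines. The names `looks_misparsed` and `find_misparsed` are hypothetical, and the heuristic will likely need tuning against real exports.

```python
import re
from pathlib import Path

# Assumption: a well-formed front matter line starts with "key:".
KEY_VALUE = re.compile(r"^[A-Za-z0-9_-]+:(\s|$)")


def looks_misparsed(text: str) -> bool:
    """Heuristically flag a note whose front matter seems to contain body content."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no front matter at all, nothing to check

    # Locate the closing delimiter of the front matter block.
    end = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = i
            break
    if end is None:
        return True  # the block never closes, so the rest of the file was swallowed

    for line in lines[1:end]:
        if not line.strip():
            continue  # blank line
        if line.startswith((" ", "\t", "- ")):
            continue  # indented continuation or list item
        if KEY_VALUE.match(line):
            continue  # ordinary key/value pair
        return True  # anything else is treated as stray content
    return False


def find_misparsed(root: str) -> list[Path]:
    """Return every .md file under root that the heuristic flags."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if looks_misparsed(text):
            hits.append(path)
    return hits
```

Calling `find_misparsed` on the import folder and printing the result gives a list to review by hand before deleting or re-importing anything. The check deliberately avoids a YAML parser so it has no dependencies. The trade-off is that it can miss stray content that happens to look like a key/value line.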