orgapp / orgajs

parse org-mode content into AST
https://orga.js.org
MIT License

Weird paragraph structure / line breaks (gatsby-transformer-orga) #73

Open thomasheartman opened 3 years ago

thomasheartman commented 3 years ago

It seems that gatsby-transformer-orga breaks paragraphs on single line breaks, or otherwise inconsistently. As someone who uses auto-fill-mode in Emacs, I find this unexpected. Is this intentional or is it a bug?

As a clearer example, the following bit of text:

You've heard a lot about it, you've bought into the hype, and the day has
finally come. It's time for you to start writing Rust!

So you sit down---hands on the keyboard, heart giddy with anticipation---and
write a few lines of code. You run the ~cargo run~ command, excited to see
whether the program works as expected. You've heard Rust is one of those 'once
it compiles, it works' languages and want to test it for yourself. The compiler
starts up, you follow the output, when suddenly:

is rendered as this:

[image]

which is this HTML:

<p> You've heard a lot about it, you've bought into the hype, and the day has </p>
<p>finally come. It's time for you to start writing Rust! </p>
<p>So you sit down---hands on the keyboard, heart giddy with anticipation---and write a few lines of code. You run the <code>cargo run</code> command, excited to see </p>
<p>whether the program works as expected. You've heard Rust is one of those 'once it compiles, it works' languages and want to test it for yourself. The compiler </p>
<p>starts up, you follow the output, when suddenly: </p>

This shows that the two paragraphs above are broken into five <p> elements, and it's not entirely obvious to me how the algorithm works here. For instance, the last (org) paragraph is five lines, but it's broken up into three paragraphs, where the first two contain two lines each and the last one contains one. The first org paragraph is broken into two separate paragraphs, one for each line.
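For reference, the behaviour expected for hard-wrapped (auto-filled) text can be sketched in a few lines of JavaScript. This is only an illustration of the expectation, not orga's parser: the function names toParagraphs and toHtml are made up for this example, and inline markup such as ~cargo run~ is ignored.

```javascript
// Sketch only: NOT orga's implementation. It models the expectation that
// blank lines separate paragraphs and single line breaks are just spaces.
function toParagraphs(text) {
  return text
    .split(/\n\s*\n/) // one or more blank lines end a paragraph
    .map((block) =>
      block
        .split('\n')
        .map((line) => line.trim())
        .filter(Boolean)
        .join(' ') // a single line break inside a paragraph becomes a space
    )
    .filter((paragraph) => paragraph.length > 0);
}

function toHtml(text) {
  return toParagraphs(text)
    .map((paragraph) => `<p>${paragraph}</p>`)
    .join('\n');
}
```

With this model, the two org paragraphs above would produce exactly two <p> elements, however many lines each one is wrapped across.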

This appears to have been introduced in a recent version of Orga (or of the gatsby-transformer), probably v2.

Thanks!

xiaoxinghu commented 3 years ago

@thomasheartman yeah, I have noticed it, and it should be fixed by this change. It's published, please try again.

thomasheartman commented 3 years ago

Oh, nice! I couldn't find release 2.2.4 on npm, though: the latest version is 2.2.3. Is 2.2.4 on the way up or did something go wrong with the publishing process? I tried using 2.2.3, but got a build error due to a missing module (I'm guessing it's the one referenced in this commit).

xiaoxinghu commented 3 years ago

Yeah, sorry about the mess, you caught me doing "testing in production" 😆. It should be ok now. BTW, take a look at the updated gatsby theme (gatsby-theme-blorg), the new starter project that utilises it, and its documentation. Any feedback would be appreciated.

thomasheartman commented 3 years ago

I tried again today with versions 2.2.4 and 2.2.5, but both failed for the same reason. Here's the Netlify build log failure:

7:41:38 PM: $ gatsby build
7:41:40 PM: success open and validate gatsby-configs - 0.032s
7:41:40 PM: error Error in "/opt/build/repo/node_modules/gatsby-transformer-orga/gatsby-node.js": Cannot find module 'hast-util-is-element/convert'
7:41:40 PM: 
7:41:40 PM: 
7:41:40 PM:   Error: Cannot find module 'hast-util-is-element/convert'
7:41:40 PM:   
7:41:40 PM:   - loader.js:636 Function.Module._resolveFilename
7:41:40 PM:     internal/modules/cjs/loader.js:636:15
7:41:40 PM:   
7:41:40 PM:   - loader.js:562 Function.Module._load
7:41:40 PM:     internal/modules/cjs/loader.js:562:25
7:41:40 PM:   
7:41:40 PM:   - loader.js:692 Module.require
7:41:40 PM:     internal/modules/cjs/loader.js:692:17
7:41:40 PM:   
7:41:40 PM:   - v8-compile-cache.js:159 require
7:41:40 PM:     [repo]/[v8-compile-cache]/v8-compile-cache.js:159:20
7:41:40 PM:   
7:41:40 PM:   - index.js:4 Object.<anonymous>
7:41:40 PM:     [repo]/[hast-util-to-text]/index.js:4:15
7:41:40 PM:   
7:41:40 PM:   - v8-compile-cache.js:178 Module._compile
7:41:40 PM:     [repo]/[v8-compile-cache]/v8-compile-cache.js:178:30
7:41:40 PM:   
7:41:40 PM:   - loader.js:789 Object.Module._extensions..js
7:41:40 PM:     internal/modules/cjs/loader.js:789:10
7:41:40 PM:   
7:41:40 PM:   - loader.js:653 Module.load
7:41:40 PM:     internal/modules/cjs/loader.js:653:32
7:41:40 PM:   
7:41:40 PM:   - loader.js:593 tryModuleLoad
7:41:40 PM:     internal/modules/cjs/loader.js:593:12
7:41:40 PM:   
7:41:40 PM:   - loader.js:585 Function.Module._load
7:41:40 PM:     internal/modules/cjs/loader.js:585:3
7:41:40 PM:   
7:41:40 PM:   - loader.js:692 Module.require
7:41:40 PM:     internal/modules/cjs/loader.js:692:17
7:41:40 PM:   
7:41:40 PM:   - v8-compile-cache.js:159 require
7:41:40 PM:     [repo]/[v8-compile-cache]/v8-compile-cache.js:159:20
7:41:40 PM:   
7:41:40 PM:   - index.js:5 Object.<anonymous>
7:41:40 PM:     [repo]/[rehype-highlight]/index.js:5:14
7:41:40 PM:   
7:41:40 PM:   - v8-compile-cache.js:178 Module._compile
7:41:40 PM:     [repo]/[v8-compile-cache]/v8-compile-cache.js:178:30
7:41:40 PM:   
7:41:40 PM:   - loader.js:789 Object.Module._extensions..js
7:41:40 PM:     internal/modules/cjs/loader.js:789:10
7:41:40 PM:   
7:41:40 PM:   - loader.js:653 Module.load
7:41:40 PM:     internal/modules/cjs/loader.js:653:32
7:41:40 PM:   
7:41:40 PM: 
7:41:40 PM: not finished load plugins - 0.415s
7:41:40 PM: ​
7:41:40 PM: ┌─────────────────────────────┐
7:41:40 PM: │   "build.command" failed    │
7:41:40 PM: └─────────────────────────────┘
7:41:40 PM: ​

Is this the same issue you've been working on or is it unrelated?

And thanks for the link to the new starter project. It looks nice! It's also good to get some default file properties to work with. One thing I thought of is that in the 'About Your Org Files ...' post, you mention that you can change which headlines should be published based on the #+orga_publish_keyword property, but it's not immediately clear to me how this works. I think it would be better if you gave examples of both how to set the property to different values (you mention it can be an array: how? space-separated? comma-separated? JS-style?) and how it chooses headlines: is it based on tags or the actual headline text? Is it a regex or a substring? Is it case-sensitive? Does the keyword have to stand on its own, or can it be part of a longer word? Also, is this similar to the select_tags property that org mode uses for its own export? Does orga support this property?

Sorry if that became a bit rambly, but these are things I thought of when reading that paragraph.

xiaoxinghu commented 3 years ago

Ah! I had that problem too while migrating my existing projects. It is an odd issue caused by a breaking change in the dependency chain within the unified ecosystem. The fix is to delete the yarn.lock or package-lock.json file (maybe also the node_modules folder for safety) and reinstall all dependencies, so that yarn (or npm) generates a new lock file. It seems that for some reason, upgrading the package hast-util-to-html did not update the version of hast-util-is-element (which is a dependency). Anyway, try that and see if it fixes it.
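To see which version of a package the lock file actually pins before and after regenerating it, a small helper along these lines may be useful. It is a sketch that only covers npm's package-lock.json (not yarn.lock); the function name findLockedVersions is made up for this example, and it relies on the documented lock file layouts: a nested "dependencies" tree in lockfileVersion 1 and a flat "packages" map in versions 2 and 3.

```javascript
// Collect every version of `name` pinned in a parsed package-lock.json.
// Hypothetical helper for illustration; not part of npm or orga.
function findLockedVersions(lock, name) {
  const found = new Set();

  // lockfileVersion 2 and 3: flat map keyed by install path, for example
  // "node_modules/a/node_modules/hast-util-is-element".
  for (const [path, entry] of Object.entries(lock.packages || {})) {
    const isMatch =
      path === `node_modules/${name}` || path.endsWith(`/node_modules/${name}`);
    if (isMatch && entry.version) found.add(entry.version);
  }

  // lockfileVersion 1 (also present in version 2 for compatibility):
  // a nested tree of dependencies.
  const walk = (deps) => {
    for (const [depName, entry] of Object.entries(deps || {})) {
      if (depName === name && entry.version) found.add(entry.version);
      walk(entry.dependencies);
    }
  };
  walk(lock.dependencies);

  return [...found].sort();
}
```

Calling it with the parsed contents of package-lock.json and the name 'hast-util-is-element' lists every pinned copy, which makes it easier to tell whether an old version survived the reinstall.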

Good suggestions on the documentation. It's pretty new, I will try to make it more explicit in the details. Exactly the kind of feedback I need. Cheers.

It actually supports the select_tags and exclude_tags properties. Thanks for reminding me of that; I will add it to the docs. But it's not for filtering posts, it's for selecting sections to export, just like in org-mode. You can get the starter project and try things out. Let me know if you have more questions, and I will try to add them to the documentation.

thomasheartman commented 3 years ago

I tried deleting the package-lock, but it didn't make a difference, I'm afraid: I'm still getting the same error. Could there be some other reason that I'm getting this? My package.json has these versions set:

    "gatsby-transformer-orga": "^2.2.5",
    "orga": "^2.2.0",

And I'm not sure I understand what you mean by 'it's not for filtering posts, it's for selecting sections to export'. Isn't that exactly what select_tags and exclude_tags take care of? You use them to say 'export only sections with these tags' or 'do not export sections with these tags'?
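The reading of select_tags and exclude_tags described here can be written down as a small filter. This is a simplified model of the org-mode export rule as it is commonly documented, applied to a flat list of sections; it is not orga's code, it ignores tag inheritance and subtrees, and the function name and the shape of the section objects are assumptions made for the example.

```javascript
// Simplified model of org-mode's select_tags / exclude_tags export rule.
// Hypothetical helper: sections are assumed to look like { title, tags }.
function sectionsToExport(sections, { selectTags = [], excludeTags = [] } = {}) {
  const hasAny = (section, tags) => section.tags.some((tag) => tags.includes(tag));

  // Selection only kicks in when at least one section carries a select tag;
  // otherwise everything that is not excluded gets exported.
  const anySelected = sections.some((section) => hasAny(section, selectTags));

  return sections.filter((section) => {
    if (hasAny(section, excludeTags)) return false; // exclusion always wins
    return anySelected ? hasAny(section, selectTags) : true;
  });
}
```

For example, with selectTags set to ['export'] and excludeTags set to ['noexport'] (org-mode's default values), a file where one section is tagged export yields only that section, while a file with no export tags yields every section that is not tagged noexport.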

xiaoxinghu commented 3 years ago

Try running this in your project: npm install hast-util-is-element@1.1.0. This will force the upgrade to that version. Then you can delete it from your package.json file and do another npm install, which will tidy up your lock file. This is some kind of quirkiness with npm that I don't want to get into. yarn works differently.

The tag properties exactly match the functionality in org-mode. What I mean by "not for filtering posts" is that the tags do not decide whether a section is published as a post or not. Because you can publish sections into individual posts, #+orga_publish_keyword is the filter for that. Maybe it's a little bit confusing; I will try to refine the doc, but in the meantime, trying it out is the best way to understand it.

thomasheartman commented 3 years ago

If I manually install hast-util-is-element it works, but the build breaks again if I remove it from my package.json. Are you sure it's included correctly?
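One way to check this without running a full build is to ask Node directly whether the failing request resolves. The sketch below uses require.resolve with its paths option, which is part of Node's module API; the helper name tryResolve is made up for the example.

```javascript
// Returns the resolved file path if `request` can be found starting from
// `fromDir`, or null if Node cannot find it. Hypothetical helper.
function tryResolve(request, fromDir = process.cwd()) {
  try {
    return require.resolve(request, { paths: [fromDir] });
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') return null;
    throw err; // anything else is unexpected, so surface it
  }
}
```

Running tryResolve('hast-util-is-element/convert') from the project root shows whether the top-level copy provides that file. Node resolves a request relative to the package that makes it, so passing the directory of the requiring package (hast-util-to-text in the build log above) as fromDir gives a closer picture of what the build sees, for instance if an older nested copy is shadowing the top-level one.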

There's a new problem, though: there seems to have been a breaking change between 2.0.0 and 2.2.9, because my GraphQL queries are suddenly broken:

 ERROR #11321  PLUGIN

"gatsby-node.js" threw an error while running the onCreateNode lifecycle:

Cannot read property 'title' of undefined

  165 |
  166 |   if (node.internal.type === "OrgContent") {
> 167 |     const value = toUrl(node.metadata.title)
      |                                       ^
  168 |     createNodeField({
  169 |       name: `slug`,
  170 |       node,

File: gatsby-node.js:167:39

  TypeError: Cannot read property 'title' of undefined

  - gatsby-node.js:167 Object.exports.onCreateNode
    /home/thomas/projects/blog/gatsby-node.js:167:39

  - api-runner-node.js:256 runAPI
    [blog]/[gatsby]/dist/utils/api-runner-node.js:256:37

  - api-runner-node.js:375 Promise.catch.decorateEvent.pluginName
    [blog]/[gatsby]/dist/utils/api-runner-node.js:375:15

  - debuggability.js:384 Promise._execute
    [blog]/[bluebird]/js/release/debuggability.js:384:9

  - promise.js:518 Promise._resolveFromExecutor
    [blog]/[bluebird]/js/release/promise.js:518:18

  - promise.js:103 new Promise
    [blog]/[bluebird]/js/release/promise.js:103:10

  - api-runner-node.js:374
    [blog]/[gatsby]/dist/utils/api-runner-node.js:374:12

  - util.js:16 tryCatcher
    [blog]/[bluebird]/js/release/util.js:16:23

  - reduce.js:166 Object.gotValue
    [blog]/[bluebird]/js/release/reduce.js:166:18

  - reduce.js:155 Object.gotAccum
    [blog]/[bluebird]/js/release/reduce.js:155:25

  - util.js:16 Object.tryCatcher
    [blog]/[bluebird]/js/release/util.js:16:23

  - promise.js:547 Promise._settlePromiseFromHandler
    [blog]/[bluebird]/js/release/promise.js:547:31

  - promise.js:604 Promise._settlePromise
    [blog]/[bluebird]/js/release/promise.js:604:18

  - promise.js:649 Promise._settlePromise0
    [blog]/[bluebird]/js/release/promise.js:649:10

  - promise.js:729 Promise._settlePromises
    [blog]/[bluebird]/js/release/promise.js:729:18

  - async.js:93 _drainQueueStep
    [blog]/[bluebird]/js/release/async.js:93:12

And this:

ERROR #85923 GRAPHQL

There was an error in your GraphQL query:

Cannot query field "title" on type "Metadata".

If you don't expect "title" to exist on the type "Metadata" it is most likely a typo. However, if you expect "title" to exist there are a couple of solutions to common problems:

  • If you added a new data source and/or changed something inside gatsby-node.js/gatsby-config.js, please try a restart of your development server
  • The field might be accessible in another subfield, please try your query in GraphiQL and use the GraphiQL explorer to see which fields you can query and what shape they have
  • You want to optionally use your field "title" and right now it is not used anywhere. Therefore Gatsby can't infer the type and add it to the GraphQL schema. A quick fix is to add a least one entry with that field ("dummy content")

It is recommended to explicitly type your GraphQL schema if you want to use optional fields. This way you don't have to add the mentioned "dummy content". Visit our docs to learn how you can define the schema for "Metadata": https://www.gatsbyjs.org/docs/schema-customization/#creating-type-definitions

File: gatsby-node.js:53:24

Has there been an update to how you add custom fields to the Metadata object? It's been a while since I touched this stuff, but it built just fine on version 2.0.0 of the gatsby-transformer-orga package.
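Until it is clear where the title lives in the newer node shape, the onCreateNode handler can at least be made to report the problem instead of crashing the build. The following is a sketch under assumptions: toUrl here is a stand-in for the helper in the trace above (its real implementation is not shown in this thread), and the check only guards against a missing metadata.title; it does not claim to know the field's new location.

```javascript
// Stand-in for the `toUrl` helper referenced in the trace (assumption).
const toUrl = (title) =>
  title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');

const onCreateNode = ({ node, actions, reporter }) => {
  if (node.internal.type !== 'OrgContent') return;

  const title = node.metadata && node.metadata.title;
  if (typeof title !== 'string') {
    // List the keys the node really has, to help find where the title moved.
    const message =
      `OrgContent node ${node.id} has no metadata.title; ` +
      `available keys: ${Object.keys(node).join(', ')}`;
    if (reporter) reporter.warn(message);
    return;
  }

  actions.createNodeField({ node, name: 'slug', value: toUrl(title) });
};

exports.onCreateNode = onCreateNode;
```

The warning output, together with the GraphiQL explorer mentioned in the Gatsby error message, should show which fields the OrgContent nodes expose in the installed version.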