snosov1 / toc-org

toc-org is an Emacs utility to have an up-to-date table of contents in the org files without exporting (useful primarily for readme files on GitHub)
GNU General Public License v3.0

doesn't strip TODO progress cookies #24

Closed VladimirAlexiev closed 8 years ago

VladimirAlexiev commented 8 years ago

Related to #3: it still doesn't strip TODO progress cookies. See (info "(org)Breaking Down Tasks"). These look like

     * Organize Party [33%]
     ** TODO Call people [1/2]
     *** TODO Peter
     *** DONE Sarah
     ** TODO Buy food
     ** DONE Talk to neighbor

Eg in my case, 3 of the headlines below have such cookies, and the generated links are invalid:

 - [[#intro][Intro]]
   - [[#queries][Queries]]
 - [[#validation-116][Validation [1/16]]]
   - [[#entity-linking-service-01][Entity Linking Service [0/1]]]
     - [[#underscores-to-spaces][Underscores to Spaces]]
   - [[#advanced-context-extraction-service-13][Advanced Context Extraction Service [1/3]]]
     - [[#wrong-prefix-for-text-characteristics][Wrong prefix for Text Characteristics]]
     - [[#crawler-to-decode-html-entities][Crawler to decode HTML entities]]
     - [[#keywords-vs-category][Keywords vs Category]]

snosov1 commented 8 years ago

Just to be clear, you're talking about the [33%] and [1/2] cookies at the end of the headlines?

VladimirAlexiev commented 8 years ago

Yes. May I suggest you explore existing org functions for doing all this stuff, rather than working with raw text (and writing your own regexps). All of this is already written, eg for the HTML exporter, but I think also in the org core itself.

Eg your toc-org-states-regexp catches only TODO and DONE. But in org, I can define my own TODO keywords in a #+TODO: header line

VladimirAlexiev commented 8 years ago

See (apropos "statistics-cookie")

VladimirAlexiev commented 8 years ago

quick patch, regexp from file:~/.emacs.d/elpa/org-plus-contrib-20150803/org-element.el::(defun org-element-statistics-cookie-successor ()

      ;; strip statistics cookies
      (goto-char (point-min))
      (while (re-search-forward "\\[[0-9]*\\(%\\|/[0-9]*\\)\\]" nil t)
        (replace-match "" nil nil))

snosov1 commented 8 years ago

> May I suggest you explore existing org functions for doing all this stuff, rather than working with raw text (and writing your own regexps). All of this is already written, eg for the HTML exporter, but I think also in the org core itself.

I would love to, and I try to do this whenever I can. But the thing is, a lot of the time it's not feasible for one reason or another.

Like, for this particular case, the (apropos "statistics-cookie") call gives me

Type RET on a type label to view its full documentation.

:filter-statistics-cookie
  Variable: (not documented)
:with-statistics-cookies
  Variable: (not documented)
org-ascii-statistics-cookie
  Function: Transcode a STATISTICS-COOKIE object from Org to ASCII.
org-element-statistics-cookie-interpreter
  Function: Interpret STATISTICS-COOKIE object as Org syntax.
org-element-statistics-cookie-parser
  Function: Parse statistics cookie at point, if any.
org-export-filter-statistics-cookie-functions
  Variable: List of functions applied to a transcoded
            statistics-cookie.
  Properties: variable-documentation
org-export-with-statistics-cookies
  User option: Non-nil means include statistics cookies in export.
  Properties: standard-value custom-version custom-package-version
              custom-type custom-requests variable-documentation
org-html-statistics-cookie
  Function: Transcode a STATISTICS-COOKIE object from Org to HTML.
org-latex-statistics-cookie
  Function: Transcode a STATISTICS-COOKIE object from Org to LaTeX.
org-update-statistics-cookies
  Command: Update the statistics cookie, either from TODO or from
           checkboxes.

So, there's nothing I can use out of the box.

Then, your function org-element-statistics-cookie-successor doesn't seem to be an "official" API, so it's subject to change. For example, my version of org (8.3.2) doesn't seem to have it at all.

Another example: just yesterday, I tried to use the org-drawer-regexp variable to skip the drawers. It turns out the value differs between 8.3.2 and 8.2.10 to the extent that I'd need an if branch in the application code to distinguish the versions. That's overkill compared to simply copying the regexp.

So, I'll do just that with your snippet - simply copy-paste it. Thx! =)

VladimirAlexiev commented 8 years ago

My point is that org syntax is more involved than it appears. Eg your toc-org-states-regexp catches only TODO and DONE. But in org, I can define my own TODO keywords in a #+TODO: header line. So better to get the pure headline that you need using a function, rather than writing your own. Maybe the org-element code I suggested is contributed rather than official, but I hope in the official distribution there's enough parsing functions. Eg org can make a table of headings with extra info, see (info "(org)Column view"), so obviously it can extract the pure headline.
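
A minimal sketch of that idea, assuming `org-get-heading` (with its NO-TAGS and NO-TODO arguments) is available in the installed Org version. The helper name `my/toc-pure-headings` is hypothetical and not part of toc-org. `org-get-heading` respects the keywords declared on a #+TODO: line but does not drop statistics cookies, so those are still stripped with the regexp from the patch above:

```elisp
(require 'org)
(require 'subr-x) ; for `string-trim' on older Emacs versions

;; Hypothetical helper, for illustration only (not part of toc-org).
(defun my/toc-pure-headings ()
  "Return the headline texts of the current Org buffer as a list.
TODO keywords (including custom ones from a #+TODO: line) and tags
are dropped by `org-get-heading'.  Statistics cookies are not, so
they are removed afterwards with a regexp."
  (org-map-entries
   (lambda ()
     (let ((heading (substring-no-properties (org-get-heading t t))))
       (string-trim
        (replace-regexp-in-string
         "\\[[0-9]*\\(%\\|/[0-9]*\\)\\]" "" heading))))))
```

The buffer must be in org-mode before the call, since that is when Org reads the #+TODO: line and builds its keyword regexps.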

If a regexp is different between 8.3.2 and 8.2.10, I think you should use the official one, not a fixed regex in your code. Else your regex may not work in my version of org.

snosov1 commented 8 years ago

> My point is that org syntax is more involved than it appears.

Totally understood.

> Eg your toc-org-states-regexp catches only TODO and DONE. But in org, I can define my own TODO keywords in a #+TODO: header line.

But will GitHub treat it correctly? :wink:

> but I hope in the official distribution there's enough parsing functions

This particular package tries to balance between HTML renderers (i.e. GitHub) and the richness of org facilities. So there are a lot of unusual issues that need to be taken into account, e.g. GitHub does something one way, org does it another way; different versions of org behave differently; different languages work differently, etc.

After all the time I've been maintaining the package, it's now abundantly clear that trying to use org functions as much as possible does more harm than good. Again, this is mostly because the package tries to be a smooth bridge between the two worlds: the GitHub renderer and Org-mode.

VladimirAlexiev commented 8 years ago

Ideally, links should work in Github, Org and HTML export. Right now they work only in Org, see #28 :-(

snosov1 commented 8 years ago

> Ideally, links should work in Github, Org and HTML export.

My main focus is to make them work in both Github and Org. I see the problems you've pointed out in #28 and will work on them. Most likely, I'll fix these next week.

As for HTML export - I don't really have an intention to work on that. My understanding is that you'd be better off using native org-export utilities for that. That said, maybe there's a simple way to make all 3 work at the same time, but it's unlikely that I'll be implementing this myself.

VladimirAlexiev commented 8 years ago

They don't work in Github and Org since the former makes eg #TODO-underscores-to-spaces-12 but the latter uses eg #underscores-to-spaces.

Please amend the regex to gobble surrounding space:

(defconst toc-org-statistics-cookie-regexp " *\\[[0-9]*\\(%\\|/[0-9]*\\)\\] *"
  "Regexp to find statistics cookies on the headline, eg [1/3] or [33%]")
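
A small sketch of how the amended regexp could be applied to a headline string. The names `my/cookie-regexp` and `my/strip-cookies` are hypothetical. Because the regexp swallows spaces on both sides, replacing a match with the empty string would join the neighbouring words when a cookie sits mid-headline, so this sketch replaces each match with a single space and trims the result:

```elisp
(require 'subr-x) ; for `string-trim' on older Emacs versions

;; Same regexp as suggested above: a cookie plus any surrounding spaces.
(defconst my/cookie-regexp " *\\[[0-9]*\\(%\\|/[0-9]*\\)\\] *"
  "Regexp matching a statistics cookie, eg [1/3] or [33%].")

;; Hypothetical helper, for illustration only (not part of toc-org).
(defun my/strip-cookies (headline)
  "Return HEADLINE with statistics cookies removed.
Each cookie is replaced by one space and the result is trimmed, so
a cookie in the middle of a headline does not glue two words together."
  (string-trim
   (replace-regexp-in-string my/cookie-regexp " " headline)))
```

For example, `(my/strip-cookies "Validation [1/16]")` leaves only the headline text.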

snosov1 commented 8 years ago

> They don't work in Github and Org since the former makes eg #TODO-underscores-to-spaces-12 but the latter uses eg #underscores-to-spaces.

Yes, I understand that it doesn't work now =) But I'll fix this.

It's just that the issue is a bit more complicated than simply leaving the TODO state in place. You also need to check whether #+OPTIONS: todo:t is present in the file, because if it isn't, GitHub strips the TODO state. (Currently, toc-org always strips TODO states.)
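
A rough sketch of such a check, assuming a plain regexp search over the buffer is good enough. The helper name `my/buffer-has-todo-option-p` is hypothetical, and this is not how toc-org itself does it. It only looks at #+OPTIONS: lines in the current buffer and ignores anything pulled in from setup files:

```elisp
;; Hypothetical helper, for illustration only (not part of toc-org).
(defun my/buffer-has-todo-option-p ()
  "Return non-nil if an #+OPTIONS: line in the buffer contains todo:t."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      ;; Org keywords are case-insensitive, so fold case while searching.
      (let ((case-fold-search t))
        (re-search-forward
         "^#\\+OPTIONS:\\(.*[ \t]\\)?todo:t\\([ \t]\\|$\\)" nil t)))))
```

The result could then decide whether the TODO keyword stays in the text that is passed on for link generation.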

And for the cookies - you can't just strip them prior to calling the hrefify function, since the cookie is needed to generate the github link.

I'll make it work as soon as I get to it.

snosov1 commented 8 years ago

duplicate of #28 and #29