mfarragher / obsidiantools

Obsidian tools - a Python package for analysing an Obsidian.md vault

regex improvements (tags) #17

Closed: stepsal closed this issue 1 year ago

stepsal commented 2 years ago

The current tags_regex does not parse nested tags for me. The proposed regex also parses all tags from the "sussudio.md" example without needing to modify the raw text beforehand. It would also be a good improvement to move all the regexes into a constants.py file and load them from there.

Current regex: tags_regex=r'(?<!\()#{1}([A-z]+[0-9_\-]*[A-Z0-9]?)\/?'


Proposed Regex: tags_regex=r'(?<!\()(?<!\\)#{1}([A-z]+[0-9_\-]*[A-Z0-9]?[^\s]+(?![^\[\[]*\]\]))\/?'
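
To make the difference concrete, here is a minimal sketch that runs both patterns over a small sample string with Python's re module. The patterns are copied verbatim from this issue; the constant names, the find_tags helper, and the sample text are hypothetical and are not part of obsidiantools. It only covers nested tags and backslash-escaped hashes, not every case in "sussudio.md".

```python
import re

# Patterns copied verbatim from this issue. The constant names are
# hypothetical (e.g. what a constants.py module might hold).
CURRENT_TAGS_REGEX = r'(?<!\()#{1}([A-z]+[0-9_\-]*[A-Z0-9]?)\/?'
PROPOSED_TAGS_REGEX = (
    r'(?<!\()(?<!\\)#{1}([A-z]+[0-9_\-]*[A-Z0-9]?[^\s]+(?![^\[\[]*\]\]))\/?'
)


def find_tags(text, pattern):
    """Return the captured tag names (without the leading '#')."""
    return re.findall(pattern, text)


# Hypothetical note text containing one nested tag and one flat tag.
sample = "Notes #project/alpha and #todo here"

current_tags = find_tags(sample, CURRENT_TAGS_REGEX)
proposed_tags = find_tags(sample, PROPOSED_TAGS_REGEX)

print(current_tags)   # ['project', 'todo']
print(proposed_tags)  # ['project/alpha', 'todo']

# The extra (?<!\\) lookbehind skips a hash escaped with a backslash.
escaped = r"escaped \#nope"
print(find_tags(escaped, CURRENT_TAGS_REGEX))   # ['nope']
print(find_tags(escaped, PROPOSED_TAGS_REGEX))  # []
```

On this input the current pattern stops at the first "/" and returns only the parent tag, while the proposed pattern keeps the full nested path. Note that the trailing [^\s]+ in the proposed pattern requires at least two characters after the "#", so a one-letter tag would not match.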


louis030195 commented 1 year ago

I also need this regex.

mfarragher commented 1 year ago

Support for nested tags is now covered on the dev branch (latest commit: https://github.com/mfarragher/obsidiantools/commit/aff8605003b408db77970f2f6d20005de2cb5454).

I will also explore adding tag counts in future commits to that branch.