mfarragher / obsidiantools

Obsidian tools - a Python package for analysing an Obsidian.md vault

[FR] Options : choose to use file name / frontmatter title for graph #22

Closed Mara-Li closed 1 year ago

Mara-Li commented 1 year ago

I noticed that the graph that is created uses the filepath, and I would like to choose the frontmatter title or the filename instead. How can I do that?

Graphic reference: [image] (generated using pyvis)

mfarragher commented 1 year ago

This snippet in the demo notebook does use the note names, rather than the filepaths in the graphic:

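import matplotlib.pyplot as plt
import networkx as nx

# color_vals (the node colours) is defined earlier in the demo notebook
fig, ax = plt.subplots(figsize=(13,7))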
nx.draw(vault.graph, node_color=color_vals, with_labels=True, ax=ax)
ax.set_title('Vault graph')
plt.show()

That is via Matplotlib. I won't add functionality for one specific way of plotting to the package: there are so many packages for plotting graphs that I'll only suggest recipes like the one above.

mfarragher commented 1 year ago

If I use the vault in the demo, I get the nodes in the plot with the note names as labels, so I don't get the filepaths like in your graph:

from pyvis.network import Network

nt = Network('500px', '500px')
nt.from_nx(vault.graph)
nt.show('nx.html')

[image: pyviz]

mfarragher commented 1 year ago

I couldn't see a way to change labels in pyvis, but for networkx plotting this is a way to use frontmatter info as labels in the demo notebook:

# dict for new labels via front matter:
fm_titles_dict = {k:v.get('title', '') for k,v
                  in vault.front_matter_index.items()}

fig, ax = plt.subplots(figsize=(13,7))
nx.draw(vault.graph, node_color=color_vals,
        labels=fm_titles_dict,
        with_labels=True, ax=ax)
ax.set_title('Vault graph')
plt.show()

[image: nx-plot-frontmatter] (Only 2 of the notes had a title in frontmatter; the other notes get an empty string as their label.)
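If empty labels are not wanted, one option is to fall back to the note name when a note has no frontmatter title. A minimal sketch, assuming the frontmatter index is a dict of note name to frontmatter dict as in the snippet above; the `labels_with_fallback` helper and the sample data are made up for illustration:

```python
def labels_with_fallback(front_matter_index):
    """Map each note name to its frontmatter title, or to the note name itself."""
    return {name: (fm.get('title') or name)
            for name, fm in front_matter_index.items()}


# Hypothetical stand-in for vault.front_matter_index
sample_index = {
    'note A': {'title': 'A proper title'},
    'note B': {},                 # no frontmatter title at all
    'note C': {'title': ''},      # empty title
}

fm_labels = labels_with_fallback(sample_index)
```

The resulting dict could be passed as `labels=` to `nx.draw` in place of `fm_titles_dict`.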

Mara-Li commented 1 year ago

Thank you for all the replies! I will investigate :)

Mara-Li commented 1 year ago

Okay, so I didn't reply here, but I discovered that the names displayed with the links were for pages that don't exist yet. I updated my workflow to use the filename in these cases, but maybe the basename could be used for them instead of their path?

mfarragher commented 1 year ago

Do you have code to produce that behaviour? I don't understand how that happens.

In my demo notebook's graph visualisation via Matplotlib, the notes without a file yet are coloured grey and they use the note name as a label, not a filepath. The visualisation I did in this thread via Pyvis uses the same vault and the graph labels use note names as desired.

These lines are where the graph info is set. All the dict keys and values contain note names.

Mara-Li commented 1 year ago

My code to generate my pyvis graph:

import os
from pyvis.network import Network
import obsidiantools.api as otools

vault = otools.Vault(os.getcwd()).connect().gather()
graph = vault.graph
net = Network(height="750px", width="750px", font_color="#7c7c7c", bgcolor="transparent")
net.from_nx(graph)
net.save_graph(

I use this "fake vault" here : https://github.com/Lisandra-dev/Lisandra-dev

mfarragher commented 1 year ago

:+1: I can recreate the behaviour via that code. This behaviour mostly matches how the graph would appear in Obsidian.

The overall graph isn't accurate for a few reasons, as this package wasn't designed to support vaults with:

For example, most nodes in the graph for Mnémosyne appear as they would in Obsidian. The notes that haven't been created yet appear as they do in their wikilinks. However, the Sanktae_eldale file appears differently, as the wikilink uses a relative path and the package extracts that.

I've mentioned in the README that this package only supports vaults where each note has a unique name. I don't plan to change that, as supporting situations like wikilinks using relative or absolute paths to keep them unique adds a lot of complexity for little gain.
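To illustrate the difference, here is a minimal sketch (not the package's code) of reducing a path-style wikilink target to its basename, as suggested earlier in the thread. The `wikilink_basename` helper is hypothetical:

```python
from pathlib import PurePosixPath


def wikilink_basename(target):
    """Return the last path component of a wikilink target."""
    return PurePosixPath(target).name


# A plain note name is left unchanged
plain = wikilink_basename('Sanktae_eldale')
# A relative-path target collapses to the note name
relative = wikilink_basename('../../Compendium/Mnémosyne/(Lagendia) Mnémosyne')
```

Collapsing to the basename only works when note names are unique: two notes called index in different folders would end up as a single node.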

mfarragher commented 1 year ago

I've updated the dev branch (https://github.com/mfarragher/obsidiantools/commit/b3a077f7a5554e419ffa974b1b29dd43de202e3a) to prevent the API from going ahead with the Vault setup when the MD filenames aren't unique. This is what would appear for that vault:

NotImplementedError: obsidiantools is only supported for vaults where each MD filename is unique: 51 unique note names were found from 66 files.
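
A check along these lines can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `check_unique_note_names` is a hypothetical helper and the file list is made up.

```python
from pathlib import PurePosixPath


def check_unique_note_names(rel_filepaths):
    """Raise if two Markdown files share the same note name (file stem)."""
    md_files = [PurePosixPath(p) for p in rel_filepaths
                if PurePosixPath(p).suffix == '.md']
    unique_names = {p.stem for p in md_files}
    if len(unique_names) != len(md_files):
        raise NotImplementedError(
            'vault has duplicate MD filenames: '
            f'{len(unique_names)} unique note names were found '
            f'from {len(md_files)} files.')
    return unique_names


# Made-up file list with two notes named 'index'
files = ['docs/index.md', 'docs/blog/index.md', 'docs/README.md']
try:
    check_unique_note_names(files)
    duplicates_found = False
except NotImplementedError as err:
    duplicates_found = True
    message = str(err)
```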

Mara-Li commented 1 year ago

Thanks for the clarification! Will the graph still be created with this update? I was pretty happy to have managed to get a graph (even if it doesn't match Obsidian 100%) in my Material Mkdocs template.

mfarragher commented 1 year ago

What are your wikilink settings? :slightly_smiling_face: These are mine: [image: files-settings]

If you're using 'Shortest path when possible', I will explore it. I have an idea of a route to address it for that setting. The other options involving filepaths seem a bit complex for now.

Right now the graph for your example vault isn't represented accurately (only one index.md file will appear) so I think it's best to raise the error in that commit for now.

mfarragher commented 1 year ago

I've created a wiki and added a page for Recipes, to include some of the graph code snippets used here.

mfarragher commented 1 year ago

I had a moment of inspiration and think I've found a way to cover the 'shortest path' behaviour. See this commit in dev: https://github.com/mfarragher/obsidiantools/commit/e349d44a09de9d4ff66013fcadc8061419658493

I think the 'shortest path when possible' behaviour in Obsidian uses the POSIX relative path (regardless of which OS someone is using).

In the vault example here, I have the 66 MD files show up in the dataframe (none missing now):

<class 'pandas.core.frame.DataFrame'>
Index: 132 entries, README to ../../Compendium/Mnémosyne/(Lagendia) Mnémosyne
Data columns (total 8 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   rel_filepath      66 non-null     object        
 1   abs_filepath      66 non-null     object        
 2   note_exists       132 non-null    bool          
 3   n_backlinks       132 non-null    int64         
 4   n_wikilinks       66 non-null     float64       
 5   n_tags            66 non-null     float64       
 6   n_embedded_files  66 non-null     float64       
 7   modified_time     66 non-null     datetime64[ns]
dtypes: bool(1), datetime64[ns](1), float64(3), int64(1), object(2)
memory usage: 8.4+ KB

and I manually checked the 15 index.md files in Obsidian. These are how the note names appear with this commit:

docs/index
docs/blog/index
docs/hidden/index
docs/Compendium/index
docs/Compendium/Hayleen May/index
docs/Compendium/Ashling May/index
docs/Compendium/Kara Grimalkin/index
docs/Compendium/Mnémosyne/index
docs/Scriptorium/index
docs/Scriptorium/Vélum/index
docs/Scriptorium/Chronique de l'Impérium/index
docs/Scriptorium/Zombie Project/index
docs/Scriptorium/Pentacle déchu/index
docs/Lagendia/index
docs/outils/index
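
The naming that this output suggests can be sketched roughly like this: a note whose filename is unique keeps its bare name, while notes that share a filename are identified by their POSIX path relative to the vault, without the `.md` suffix. This is an assumption drawn from the list above, not the code in the commit; `note_names` is a hypothetical helper and the file list is made up.

```python
from collections import Counter
from pathlib import PurePosixPath


def note_names(rel_filepaths):
    """Map each relative MD filepath to a 'shortest path' style note name."""
    paths = [PurePosixPath(p) for p in rel_filepaths]
    stem_counts = Counter(p.stem for p in paths)
    names = {}
    for p in paths:
        if stem_counts[p.stem] == 1:
            names[str(p)] = p.stem                   # unique: bare note name
        else:
            names[str(p)] = str(p.with_suffix(''))   # clash: path without .md
    return names


names = note_names(['docs/index.md', 'docs/blog/index.md', 'README.md'])
```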

mfarragher commented 1 year ago

I want to tweak my tests and my test vault before closing this issue, but going by the example in this issue I think the package doesn't depend on unique note names now.

Mara-Li commented 1 year ago

Wow! Thanks for this hard work :)

Mara-Li commented 1 year ago

By the way, could you please let me know which path types are supported by the plugin? As I use it to generate a Mkdocs wiki through my plugin, obsidian publisher, I want to add more options around paths (my plugin converts the paths).

For the moment, I use "relative" paths, not shortest ones, as Mkdocs / Github don't like those.

mfarragher commented 1 year ago

The package won't handle [[.foo]] or [[..bar]] properly in the graph. A .foo note would come up in the Vault with its text content, but graph connections won't be right (.foo vs foo as separate nodes).

I haven't used the relative path setting in Obsidian really. Supporting these other modes will be best covered in new test vaults (and probably a Vault kwarg) but for now I'm sticking with 'shortest path when possible' as it's the Obsidian default.

mfarragher commented 1 year ago

I've added a test to cover notes with identical filenames: https://github.com/mfarragher/obsidiantools/commit/a1d01688e942c6314d805d9cd1badd0b38399287

Mara-Li commented 1 year ago

Quick question: was the update to Python 3.9 mandatory?

(I use the plugin on Netlify, and it's sad to hear I can't update it because Netlify sucks with Python :'D)

mfarragher commented 1 year ago

Python 3.9 is needed for the name.removesuffix('.md') part of code I've added recently.
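
For anyone stuck on Python 3.8, the same behaviour can be written without `str.removesuffix`. A minimal sketch (the `strip_md_suffix` helper is hypothetical, not part of the package):

```python
def strip_md_suffix(name, suffix='.md'):
    """Remove a trailing suffix once, like str.removesuffix in Python 3.9+."""
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


stripped = strip_md_suffix('docs/index.md')
untouched = strip_md_suffix('README')
```

Note that `name.rstrip('.md')` is not a substitute, since it strips any trailing `.`, `m` or `d` characters rather than the suffix as a whole.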

This package requires Pandas and NetworkX - currently they support Python 3.8 and above, but with new releases next year I can imagine those moving to Python 3.9 and higher, so it's also to get ahead of that.