hydrusnetwork / hydrus

A personal booru-style media tagger that can import files and tags from your hard drive and popular websites. Content can be shared with other users via user-run servers.
http://hydrusnetwork.github.io/hydrus/

Adopt Proper Graph Theory Terminology For Tag Graph #942

Open bbappserver opened 2 years ago

bbappserver commented 2 years ago

I know that, because of all the legacy and documentation that would have to change, modifying the tag graph terminology would be kind of a pain. But I also believe a lot of time could be saved when explaining things to new users, or when discussing them in a technical sense, if the terminology were brought in line with how people typically think of trees in a genealogical sense or its mathematical analog. I know the feature is based on one particular booru that happened to use those terms... but to put it charitably, that booru was just plain incorrect.

The most intuitive way to describe how the tag graph works is to show a picture, but for that to work you have to talk about nodes in a genealogical sense and then every single time you have to double back and say "but remember it's actually the reverse of that for hydrus".

It is very confusing to discuss the tag graph. The first time any new user wants to use it, I have to start by explaining that "parent" means "superset" and "sibling" means "lexical alias". This runs counter to the way anyone thinks of a lineage, and is the opposite of how mathematicians and computer scientists like to think of tree structures.

Then when you try to explain tag resolution you have to mentally invert your model of how a tree works, because children of the parent graph are parents in hydrus terminology, and siblings in the parent graph have nothing to do with being a parent (in hydrus terminology) that shares a child (in hydrus terminology), that is, a node that shares a parent in graph theory.

God forbid you need to talk about ancestors and descendants for tag maintenance or a bug report. Because then you really want to say that a->{b,c,d}->{e}->{f,g,h} means {b,c,d,e,f,g,h} are descendants of a. But nooooo, not in hydrus: in hydrus those are all ancestors.

image

children(a)={aa,ab}
descendants(a)={aa,ab,aaa,aab,aac}
siblings(aa)={ab}
ancestors(aaa)={aa,a}
siblings(aab)={aaa,aac}
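
The relations listed above can be sketched in a few lines of Python. This is only an illustration, not hydrus code: the `CHILDREN` mapping and all helper names are assumptions made for this example, and the edges are inferred from the sets listed above rather than read off the image.

```python
# Hypothetical example tree, inferred from the sets listed above:
# a -> {aa, ab} and aa -> {aaa, aab, aac}. Not hydrus code.
CHILDREN = {
    "a": {"aa", "ab"},
    "aa": {"aaa", "aab", "aac"},
}


def children(node):
    """Direct children of a node (empty set for leaves)."""
    return set(CHILDREN.get(node, set()))


def parent(node):
    """The single parent of a node in this tree, or None for the root."""
    for candidate, kids in CHILDREN.items():
        if node in kids:
            return candidate
    return None


def descendants(node):
    """Every node reachable by following child edges downward."""
    found = set()
    for kid in children(node):
        found.add(kid)
        found |= descendants(kid)
    return found


def ancestors(node):
    """Every node reached by following parent edges upward."""
    found = set()
    current = parent(node)
    while current is not None:
        found.add(current)
        current = parent(current)
    return found


def siblings(node):
    """Other nodes that share this node's parent."""
    p = parent(node)
    return children(p) - {node} if p is not None else set()


print(descendants("a"))
print(siblings("aab"))
print(ancestors("aaa"))
```

With the genealogical reading, each helper follows the edges in the direction its name suggests, so nothing has to be explained in reverse.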

image

aliases(clark kent)={character:superman, kal-el, clark kent}
king(clark kent)=character:superman
descendants(clark kent)=descendants(aliases(clark kent) U {clark kent})=descendants({character:superman, kal-el, clark kent})={kryptonian, series:superman, dc comics}
ancestors(dc comics)={aliases(series:superman), aliases(wonder woman), aliases(character:superman)}
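
One possible way to model `aliases` and `king` is a table of alias groups, each with one designated representative. The sketch below is an assumption for illustration only: the `ALIAS_GROUPS` structure and the helper bodies are hypothetical and say nothing about how hydrus stores this internally. Only the alias group itself comes from the example above.

```python
# Hypothetical alias table: each group of aliases maps to its chosen "king".
# The group is taken from the example above; the data structure is an assumption.
ALIAS_GROUPS = {
    frozenset({"character:superman", "kal-el", "clark kent"}): "character:superman",
}


def aliases(tag):
    """All tags in the same alias group as `tag`, including `tag` itself."""
    for group in ALIAS_GROUPS:
        if tag in group:
            return set(group)
    return {tag}  # a tag with no aliases forms its own group


def king(tag):
    """The representative tag of the alias group (the tag itself if it has no aliases)."""
    for group, representative in ALIAS_GROUPS.items():
        if tag in group:
            return representative
    return tag


print(king("clark kent"))  # prints character:superman
```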

Then you can talk about the graph traversal without having to specify everything twice.

# Assumes aliases() and children() take and return sets of tags, and that
# descendant_cache is a dict defined elsewhere. The graph must be acyclic.
def descendants(tagset):
  '''Pseudocode explaining tag resolution that doesn't need everything explained twice'''
  tagset = frozenset(tagset)  # hashable, so it can be used as a cache key
  if not tagset: return set()
  if tagset in descendant_cache: return descendant_cache[tagset]
  alias_set = aliases(tagset)
  child_set = children(alias_set)
  # The children plus their aliases form the next level down.
  alias_child_set = aliases(child_set).union(child_set)
  desc = descendants(alias_child_set).union(alias_child_set)
  descendant_cache[tagset] = desc
  return desc

TheElo commented 2 years ago

What is a tag graph and where/how do you even find them in hydrus?

vickyorlo commented 2 years ago

I have no clue what you're trying to say. I think you either don't understand graph theory as well as you think you do, you don't understand what's commonly intuitive, both, or something else. "DC Comics" is not the child of "Superman" but the other way around, in the sense that without DC Comics existing Superman would not exist, and in the sense that a child can have only one parent, which makes far more sense than a parent being able to have only one child while a child has many parents. DC Comics is far easier to think of as an ancestor of Superman. Your second graph is directed in the wrong direction. Do note that a directed rooted tree can be either an in-tree or an out-tree, and there's nothing "wrong" with either approach.

also what elo said

SwordfishBooks commented 2 years ago

The idea that Superman wouldn't exist without DC Comics is ridiculous.

Firstly, Clark Kent/Superman debuted in 1938's Action Comics 1 published by "National Allied Publications". "DC Comics" is a 1977 rebranding of "National Comics" which is a 1946 merger of "National Allied Publications", "Detective Comics Inc", and "All-American Publications". In short, saying "DC Comics" created Superman is like saying Gogeta is Gohan's father.

Second, Clark Kent/Superman was created before Jerry Siegel and Joe Shuster started working at "National Allied Publications". Therefore, even if "DC Comics" were synonymous with "National Allied Publications", the claim that Superman wouldn't exist without "DC Comics" would still be false.

bbappserver commented 2 years ago

> What is a tag graph and where/how do you even find them in hydrus?

The tag graph is not a visually represented UI element. It is, in an abstract sense, how the tagging engine resolves what are currently called siblings and parents into the virtual tags from the real tags. Let superman be kryptonian and male; then you walk from superman to kryptonian and from superman to male in the tag graph to collect the virtual tags that also apply to an image tagged with superman.

Please also ignore whether or not you agree with the example tagging representation of superman, and consider it only as a possible use of siblings and parents, chosen to exemplify how the graph works. I haven't thought it through too deeply; I just needed a decent example. Instead I'll let superman be male and kryptonian so we don't have to debate about superman appearing in other works.
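
The walk described above, from superman to kryptonian and from superman to male, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the hydrus implementation: the `IMPLIES` mapping and the `virtual_tags` helper are hypothetical names, and only the two edges from the example are included.

```python
# Hypothetical edges for the example: superman implies kryptonian and male.
# Neither the mapping nor the function name comes from hydrus.
IMPLIES = {
    "superman": {"kryptonian", "male"},
}


def virtual_tags(real_tags):
    """Collect every tag reachable from the real tags, excluding the real tags themselves."""
    seen = set(real_tags)
    stack = list(real_tags)
    while stack:
        tag = stack.pop()
        for implied in IMPLIES.get(tag, ()):
            if implied not in seen:  # the seen-set also guards against cycles
                seen.add(implied)
                stack.append(implied)
    return seen - set(real_tags)


print(sorted(virtual_tags({"superman"})))  # ['kryptonian', 'male']
```

An image tagged only with superman would then also match searches for kryptonian or male, without those tags ever being stored on the file.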

To clarify what the images are showing.