Start developing end-user API for temporal graphs

An end-user API should cover at least the following:

[ ] Patch constructor for each day
[ ] Node features for each patch
[ ] Node renumbering: nodes should be numbered $0 \ldots |V|-1$, where $V$ is the full vertex set, and also $0 \ldots |P_k|-1$ within each patch $P_k \subset V$. There should be a way to recover both the node numbering in the full vertex set and the node label.
[ ] Generate embeddings for each patch using a method such as VGAE. Methods could be abstract classes or functions following a typing.Protocol
[ ] Align embeddings using an EmbeddingMethod (could be local2global, or the new method being developed).
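
The node renumbering item above can be sketched as follows. This is only an illustration under stated assumptions: `PatchIndex` and its method names are hypothetical and not part of any existing l2g API, and patches are assumed to be given as collections of node labels.

```python
import numpy as np


class PatchIndex:
    """Hypothetical helper mapping between node labels, global ids
    (0..|V|-1) and per-patch local ids (0..|P_k|-1)."""

    def __init__(self, labels):
        # Global numbering: position of each label in the full vertex set V.
        self.labels = list(labels)
        self.global_id = {label: i for i, label in enumerate(self.labels)}
        self.patches = {}  # patch name -> sorted array of global ids

    def add_patch(self, name, patch_labels):
        # Local id of a node is its position in the sorted global-id array.
        ids = sorted({self.global_id[label] for label in patch_labels})
        self.patches[name] = np.array(ids, dtype=np.int64)

    def to_global(self, name, local_ids):
        return self.patches[name][np.asarray(local_ids, dtype=np.int64)]

    def to_local(self, name, global_ids):
        ids = self.patches[name]
        global_ids = np.asarray(global_ids, dtype=np.int64)
        pos = np.searchsorted(ids, global_ids)
        # Reject nodes that are not members of the patch.
        if np.any(pos >= len(ids)) or np.any(ids[np.minimum(pos, len(ids) - 1)] != global_ids):
            raise KeyError("node not in patch")
        return pos

    def to_labels(self, name, local_ids):
        return [self.labels[g] for g in self.to_global(name, local_ids)]


index = PatchIndex(["a", "b", "c", "d", "e"])
index.add_patch("day-1", ["d", "b", "e"])
local = index.to_local("day-1", [1, 4])
labels = index.to_labels("day-1", local)
```

Keeping the global ids of each patch as a sorted array makes the local-to-global direction a plain array lookup and the reverse direction a binary search, which may or may not be the right trade-off for very large graphs.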

Further processing of embeddings, such as using them for classification, is out of scope for this issue.

Sketch

import numpy as np

from l2g import make_patch_graph, DataLoader, make_embedding, align_embeddings

# TODO: see what other graph embedding libraries use and try to be compatible
# L2Gv2 should be able to work with any embedding
from l2g.embeddings import VGAEEmbedding

# Local2Global is the old algorithm, ManifoldOptimizer the new one
from l2g.align import Local2Global, ManifoldOptimizer

# Load data
ds = DataLoader('l2gv2/nas')  # loads from web (HuggingFace?)

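# patch_identifier: str | Callable[[V], str], i.e. an attribute name or a function mapping a node to its patch label
P = make_patch_graph(ds, patch_identifier="day")  # "day" is a placeholder attribute name
vgae = VGAEEmbedding()  # model hyperparameters would be passed as keyword arguments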

# Create embeddings, can use trivial parallelism here (multiprocessing.Pool)
embs: dict[str, np.ndarray] = make_embedding(vgae, P)  # calls vgae.fit_transform(P[i]) for each patch node i
# ^do node and edge embeddings need to be disambiguated?

# Alignment
aligner = ManifoldOptimizer()

# .fit() could generate the alignment criteria (scaling, orthogonal transformations and translation)
# whereas .fit_transform() applies it. Not clear whether keeping them separate makes sense.
X = aligner.fit_transform(embs)  # X is xarray with node labels
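
To make the `.fit()` versus applying-the-transform question concrete, here is a minimal sketch of one way the separation could look. It uses plain orthogonal Procrustes with scaling and translation on the nodes shared by two patches, which is a stand-in technique and neither Local2Global nor ManifoldOptimizer. The class name `ProcrustesAligner` and its attributes are hypothetical.

```python
import numpy as np


class ProcrustesAligner:
    """Hypothetical aligner: fit() estimates scaling, an orthogonal
    transformation and a translation; transform() applies them."""

    def fit(self, source, target):
        # source, target: (n, d) embeddings of the same n overlap nodes.
        source = np.asarray(source, dtype=float)
        target = np.asarray(target, dtype=float)
        self.source_mean_ = source.mean(axis=0)
        self.target_mean_ = target.mean(axis=0)
        a = source - self.source_mean_
        b = target - self.target_mean_
        # Orthogonal Procrustes: with a^T b = U S V^T, the orthogonal matrix
        # R = U V^T minimises ||s * a @ R - b||_F, and s = trace(S) / ||a||_F^2.
        u, s, vt = np.linalg.svd(a.T @ b)
        self.orthogonal_ = u @ vt
        self.scale_ = s.sum() / (a ** 2).sum()
        return self

    def transform(self, x):
        x = np.asarray(x, dtype=float)
        return self.scale_ * (x - self.source_mean_) @ self.orthogonal_ + self.target_mean_


# Toy check: the target patch is a scaled, rotated, shifted copy of the source.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
target = 2.5 * x @ q + np.array([1.0, -2.0, 0.5])

# Fit on the first 10 (overlap) nodes only, then apply to every node.
aligner = ProcrustesAligner().fit(x[:10], target[:10])
aligned = aligner.transform(x)
```

One argument for keeping the two steps separate is visible here: the transformation is estimated on overlap nodes only, but then has to be applied to all nodes of the patch.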

Need to consider how much of this is portable to large graphs (perhaps by using dask and xarray). Should the use of multiprocessing / GPU / cluster be transparent to the user, which adds complexity, or should we handle that ourselves (such as using the CPU for toy datasets) and allow the user to override it as necessary?
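
One possible shape for the "sensible default, user can override" option is sketched below. It assumes embedding methods follow a `typing.Protocol` with a `fit_transform` method and that each patch is a dense symmetric adjacency matrix. Both are assumptions for illustration. `SpectralToy` is a hypothetical stand-in (not VGAE), and this `make_embedding` is only a guess at how the function from the sketch above might behave.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Mapping, Optional, Protocol

import numpy as np


class EmbeddingMethod(Protocol):
    def fit_transform(self, patch: np.ndarray) -> np.ndarray: ...


class SpectralToy:
    """Stand-in embedding (not VGAE): eigenvectors belonging to the
    largest eigenvalues of a symmetric adjacency matrix."""

    def __init__(self, dim: int = 2):
        self.dim = dim

    def fit_transform(self, patch: np.ndarray) -> np.ndarray:
        _, vecs = np.linalg.eigh(np.asarray(patch, dtype=float))
        return vecs[:, -self.dim:]


def make_embedding(
    method: EmbeddingMethod,
    patches: Mapping[str, np.ndarray],
    executor: Optional[Executor] = None,
) -> dict:
    names = list(patches)
    if executor is None:
        # Default: run serially, which is enough for toy datasets.
        results = [method.fit_transform(patches[n]) for n in names]
    else:
        # Override: any concurrent.futures.Executor supplied by the user.
        results = list(executor.map(method.fit_transform, [patches[n] for n in names]))
    return dict(zip(names, results))


patches = {
    "day-1": np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]),
    "day-2": np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]),
}
serial = make_embedding(SpectralToy(dim=2), patches)
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded = make_embedding(SpectralToy(dim=2), patches, executor=pool)
```

Accepting an executor object keeps the default path simple while leaving the choice of threads, processes or a cluster-backed executor to the caller. Whether that is enough for GPU or dask workloads is an open question.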
