flux-framework / flux-sched

Fluxion Graph-based Scheduler
GNU Lesser General Public License v3.0

Optimize the JGF portion of RV1 #526

Closed dongahn closed 3 years ago

dongahn commented 4 years ago

Currently the JGF portion of RV1 can become very large. Because it has lots of redundant information, there should be many ways to condense this format. We will first use the vanilla JGF to implement relevant end-to-end capabilities (e.g., scheduler resiliency: Issue #470) and then as part of our planned "optimization phase" we will optimize this format to reduce its storage requirements significantly.

dongahn commented 4 years ago

As a reference, I used gzip to compress a JGF file containing two subsystems, which I used for a JGF reader test. I think the repetition in JGF makes it pretty compressible: the storage reduction in this case was 34x.

-rw-r--r-- 1 ahn1 ahn1 2960742 Oct  2 17:23 power.json
-rw-r--r-- 1 ahn1 ahn1   86309 Oct  2 17:20 power.json.gz
dongahn commented 4 years ago

My current thinking is to come up with reference-resource delta compression.

The observation is: for each resource type (e.g., node or core), the JGF graph vertices vary only slightly from one another. So if a reference graph vertex is created for each type and we encode each JGF vertex against its reference vertex as a delta, we should be able to get pretty good compression.

For example, from power.json, we have 72 instances of a JGF vertex with nearly identical information:

      {
        "id": "226",
        "metadata": {
          "type": "core",
          "basename": "core",
          "name": "core1",
          "id": 1,
          "uniq_id": 226,
          "rank": -1,
          "exclusive": false,
          "unit": "",
          "size": 1,
          "paths": {
            "containment": "/power0/rack0/node0/socket0/core1"
          }
        }
      },
      {
        "id": "262",
        "metadata": {
          "type": "core",
          "basename": "core",
          "name": "core1",
          "id": 1,
          "uniq_id": 262,
          "rank": -1,
          "exclusive": false,
          "unit": "",
          "size": 1,
          "paths": {
            "containment": "/power0/rack0/node1/socket0/core1"
          }
        }
      }

So if the first JGF vertex were used as the reference, we could encode the rest of the 72 instances with only a handful of differences each.

ahn1@794a6a1e9347:/usr/src/t/data/resource/jgfs$ diff reference.json compare.json
2c2
<         "id": "226",
---
>         "id": "262",
8c8
<           "uniq_id": 226,
---
>           "uniq_id": 262,
14c14
<             "containment": "/power0/rack0/node0/socket0/core1"
---
>             "containment": "/power0/rack0/node1/socket0/core1"
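
For illustration, the encode/decode step for a single vertex against its per-type reference could look roughly like the following Python sketch (the helper names are made up, not flux-sched APIs):

# Keep only the metadata fields of a vertex that differ from its type
# reference; re-apply the reference to recover the full vertex.
def encode_delta(reference, vertex):
    delta = {}
    for key, value in vertex.items():
        if isinstance(value, dict) and isinstance(reference.get(key), dict):
            sub = encode_delta(reference[key], value)
            if sub:
                delta[key] = sub
        elif reference.get(key) != value:
            delta[key] = value
    return delta

def decode_delta(reference, delta):
    vertex = dict(reference)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(reference.get(key), dict):
            vertex[key] = decode_delta(reference[key], value)
        else:
            vertex[key] = value
    return vertex

For the two core vertices above, encode_delta would leave only id, uniq_id, and the containment path, which matches the diff.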

Some thought will be needed to minimize the encoding of each vertex's paths within its subsystems. But these are also highly structured, so we can go a long way by introducing some indexing. (BTW, power0 is the cluster name! :-)

Maybe we can build a "canonical" abstract paths map that consists only of resource basenames or types (not IDs): power, power->rack, ..., power->rack->node->socket->core, etc. Then we encode each vertex's ID sequence onto this map, like 0,0,0,0,1 for the first case and 0,0,1,0,1 for the second case. We can even apply some run-length encoding to such a sequence.
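
As a toy illustration of that path-map idea (again just a sketch, not actual flux-sched code), a containment path could be reduced to its abstract type path plus an ID sequence:

import re

# Split "/power0/rack0/node0/socket0/core1" into the abstract type path
# ("power/rack/node/socket/core") and the ID sequence [0, 0, 0, 0, 1].
def split_path(path):
    types, ids = [], []
    for comp in path.strip("/").split("/"):
        m = re.match(r"([A-Za-z_]+)(\d+)$", comp)
        types.append(m.group(1))
        ids.append(int(m.group(2)))
    return "/".join(types), ids

# split_path("/power0/rack0/node0/socket0/core1")
#   -> ("power/rack/node/socket/core", [0, 0, 0, 0, 1])
# split_path("/power0/rack0/node1/socket0/core1")
#   -> ("power/rack/node/socket/core", [0, 0, 1, 0, 1])

Only one copy of each abstract type path would need to be stored; every vertex then carries just its (run-length encodable) ID sequence.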

If we are even more motivated, we can recursively apply reference vertex encoding such that even a type reference vertex is encoded against some sort of "master" reference vertex.

I also need to think about how to compress edges.

      {
        "source": "0",
        "target": "1",
        "metadata": {
          "name": {
            "containment": "contains"
          }
        }
      },
      {
        "source": "0",
        "target": "2",
        "metadata": {
          "name": {
            "containment": "contains"
          }
        }
      }

This is again highly repetitive so the same technique should apply.

Given the "structured" nature of JGF data, we should be able to get an extremely high compression rate.

dongahn commented 4 years ago

Another reference point: power.graphml is the GRUG version of power.json above. In fact, I populated the resource graph using power.graphml and wrote it back out as JGF.

-rw-r--r-- 1 ahn1 ahn1 8289 Jun 15 01:21 power.graphml
-rw-r--r-- 1 ahn1 ahn1 1315 Jun 15 01:21 power.graphml.gz

The GRUG version is over 10x more storage efficient than the gzipped JGF, as expected. The gzipped GRUG is over 65x more efficient.

dongahn commented 4 years ago

Here is a very coarse theoretical analysis of the proposed reference-resource delta compression.

Let V be the storage requirement of a JGF resource vertex and E be the storage requirement of a JGF edge.

And let |V| and |E| be the numbers of vertices and edges and |T| be the number of unique resource types.

Say, the storage requirement of recording the delta between the master reference vertex and a type reference vertex is D and that of recording the delta between the type reference vertex and an actual vertex of that type is D'. Finally, say the storage requirement of recording the delta between the reference edge and an actual edge is D''.

Then, the overall storage requirement of our resource reference delta compression will be given by:

V + |T|xD + |V|xD' + E + |E|xD''

where V and E are the size required to represent the master resource and edge references.

Given that the current scheme's storage requirement is given by

|V|xV + |E|xE

At large scale, the compression rate will converge to

(|V|xV + |E|xE) / (|V|xD' + |E|xD'')

This means the smaller you can make D' and D'', the higher the compression rate and the more scalable the scheme will be.
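
To make that concrete with some made-up numbers (these sizes are guesses for illustration, not measurements):

# Back-of-the-envelope compression rate for the formula above.
V, E = 400, 300          # bytes per full JGF vertex / edge (assumed)
Dp, Dpp = 8, 2           # bytes per vertex delta D' / edge delta D'' (assumed)
nV = nE = 500_000        # ~1M graph entities, Sierra-ish scale

original   = nV * V + nE * E      # |V|xV + |E|xE
compressed = nV * Dp + nE * Dpp   # |V|xD' + |E|xD'' (V, |T|xD, E are negligible here)
print(original / compressed)      # 70.0 with these particular guesses
print(compressed / 1e6)           # ~5 MB

With those assumptions the compressed form lands at a few megabytes, consistent with the gut feeling below.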

Then, once our compression is done, we can apply gzip once again for further low-level compression.

For Sierra-scale systems with O(1M) graph entities (vertices and edges), my gut feeling is that this JGF compression will give you on the order of a few megabytes.

If this is still too large, we will need to take advantage of range-based compression to further minimize the |V|xD' + |E|xD'' term. That seems possible, but it is complex enough that at this point we would need some hands-on work to assess feasibility.

I feel this is a pretty interesting topic, and if we can pull this off, we should be able to claim a general advance in HPC systems research.

dongahn commented 4 years ago

@grondo: Now that I think about this, we may be able to generalize this reference delta compression and apply it to MPIR proctable compression, as MPIR data also has lots of repetition.

It would have one master reference like

int pid = 00000
hostname = "cab"
exec_name = "app"

We can encode a node reference against this master, and the rest of the proctable entries on that node can be encoded against the node reference, which results in deltas that consist only of PIDs.

The main difference between this and JGF is that the JGF nodes and edges keys are considered "unordered": it doesn't matter in which order the elements of the nodes or edges keys appear, so we can sort them to maximize the use of range compression.

But MPIR is ordered (by MPI_COMM_WORLD rank), so we can't sort the deltas to maximize range-based compression. However, since the PIDs on each node are typically consecutive, range compression is still possible even without sorting.

dongahn commented 4 years ago

BTW, I think the main difference between this and the general-purpose delta encoding used in tools like gzip is that the client provides those multilevel "references". We will see if providing domain-specific information to the delta encoding addresses our needs...

dongahn commented 4 years ago

More data points:

Without whitespace, the JGF output is significantly smaller: the power.json above shrinks from ~2.9MB to ~1.3MB. On that file, gzip provides 18x compression. xz has two modes of compression: the regular mode provides 43x, and the extreme mode (which uses far more CPU time) provides 65x.

-rw-r--r--  1 ahn1 ahn1   1275608 Oct  4 21:18 power.nw.json
-rw-r--r--  1 ahn1 ahn1     19580 Oct  4 21:20 power.nw.json.e.xz
-rw-r--r--  1 ahn1 ahn1     70354 Oct  4 21:18 power.nw.json.gz
-rw-r--r--  1 ahn1 ahn1     29300 Oct  4 21:19 power.nw.json.xz
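
(For reference, a whitespace-free file like power.nw.json above can be produced by re-serializing with compact separators; a Python snippet such as the following would do it, though this is not necessarily how the file above was generated.)

import json

# Re-serialize power.json with no whitespace between JSON tokens.
with open("power.json") as f:
    obj = json.load(f)
with open("power.nw.json", "w") as f:
    json.dump(obj, f, separators=(",", ":"))
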
dongahn commented 4 years ago

A Sierra-size JGF's storage requirement (without whitespace) is 129MB. gzip gives 20x compression and xz 60x. xz -e (extreme mode) is just prohibitively slow for a case this large. With xz, it is about 2MB.

-rw-r--r--  1 ahn1 ahn1 129749709 Oct  4 21:14 sierra.json.cp
-rw-r--r--  1 ahn1 ahn1   6180181 Oct  4 21:12 sierra.json.cp.gz
-rw-r--r--  1 ahn1 ahn1   2127664 Oct  4 21:13 sierra.json.cp.xz
garlick commented 4 years ago

Fwiw, content store does compression on content blobs before they hit disk.


dongahn commented 4 years ago

What compression library are you using? Zlib?

garlick commented 4 years ago

Lz4


dongahn commented 4 years ago

@grondo asked for some simple examples. Here are the JGF examples I have been thinking about.

Say you have only two resource types: node and core. The nodes list of JGF contains entries like:

[
      {
        "id": "9",
        "metadata": {
          "type": "node",
          "basename": "node",
          "name": "node0",
          "id": 0,
          "uniq_id": 9,
          "rank": -1,
          "exclusive": false,
          "unit": "",
          "size": 1,
          "paths": {
            "containment": "/power0/rack0/node0",
            "power": "/powerpanel0/pdu0/node0"
          }
        }
      },
      {
        "id": "10",
        "metadata": {
          "type": "node",
          "basename": "node",
          "name": "node1",
          "id": 1,
          "uniq_id": 10,
          "rank": -1,
          "exclusive": false,
          "unit": "",
          "size": 1,
          "paths": {
            "containment": "/power0/rack0/node1",
            "power": "/powerpanel0/pdu0/node1"
          }
        }
    }
]

And the core vertices look like:

[{
    "id": "225",
    "metadata": {
        "type": "core",
        "basename": "core",
        "name": "core0",
        "id": 0,
        "uniq_id": 225,
        "rank": -1,
        "exclusive": false,
        "unit": "",
        "size": 1,
        "paths": {
            "containment": "/power0/rack0/node0/socket0/core0"
        }
    }
}, {
    "id": "226",
    "metadata": {
        "type": "core",
        "basename": "core",
        "name": "core1",
        "id": 1,
        "uniq_id": 226,
        "rank": -1,
        "exclusive": false,
        "unit": "",
        "size": 1,
        "paths": {
            "containment": "/power0/rack0/node0/socket0/core1"
        }
    }
}]

We can transform this into a simple one-level reference delta scheme:

"refs" : {
    "node":  {
               "id": "9",
               "metadata": {
                 "type": "node",
                 "basename": "node",
                 "name": "node0",
                 "id": 0,
                 "uniq_id": 9,
                 "rank": -1,
                 "exclusive": false,
                 "unit": "",
                 "size": 1,
                 "paths": {
                   "containment": "/power0/rack0/node0",
                   "power": "/powerpanel0/pdu0/node0"
                 }
               }
     },
     "core": {
              "id": "226",
              "metadata": {
                "type": "core",
                "basename": "core",
                "name": "core1",
                "id": 1,
                "uniq_id": 226,
                "rank": -1,
                "exclusive": false,
                "unit": "",
                "size": 1,
                "paths": {
                  "containment": "/power0/rack0/node0/socket0/core1"
                }
              }
        }
    }
{
"nodes": {
      "ref": "node",
      "deltas": [
      {
      },
      {
          "id": "10",
          "medtadata": {
              "name": "node1",
              "id" : 1,
              "uniq_id": 10,
              "paths": {
                   "containment": "/power0/rack0/node1",
                   "power": "/powerpanel0/pdu0/node1"
               }
          }
      }
     ]
},
"cores": {
      "ref" : "core",
      "deltas" : [
      {
      },
      {
          "id": "226",
          "medtadata": {
              "name": "core1",
              "id" : 1,
              "uniq_id": 226,
              "paths": {
                   "containment": "/power0/rack0/node0/socket0/core1”
               }
          }
      }
]
}
}

The deltas would also be more amenable to further key-wide compression such as range compression, as sketched below.
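
As a rough sketch of what that key-wide range compression could look like (illustrative only): collect a given key across all deltas and collapse consecutive integer runs into ranges.

# Collapse a list of (mostly consecutive) integers gathered from the same
# key across all deltas into "start-end" range strings.
def to_ranges(values):
    runs, start, prev = [], values[0], values[0]
    for v in values[1:]:
        if v == prev + 1:
            prev = v
            continue
        runs.append((start, prev))
        start = prev = v
    runs.append((start, prev))
    return ",".join(f"{a}-{b}" if a != b else f"{a}" for a, b in runs)

# e.g. "uniq_id" values collected across core deltas:
# to_ranges([226, 227, 228, 262]) -> "226-228,262"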

So, assuming we can generate id and uniq_id consecutively (or at least with a well-known pattern) for each type, and say you have a Sierra-class system with 4320 compute nodes, this can be further condensed to:

"nodes": {
      "ref": "node",
      "deltas":
      {
          "id": "11-4342",
          "medtadata": {
              "name": "node[1-4230]",
              "id": "1-4230",
              "uniq_id": "11-4342",
              "paths": {
                  "containment": "/power0/rack0/node[1-18],/power0/rack1/node[19-36]...",
                  "power": "/powerpanel0/pdu0/node[1-18],/powerpanel0/pdu0/node[1i9136..."
              }
          }
      }
}
{
"cores":
      {
          "id": "226-190306",
          "medtadata": {
              "name": "core[0-43]*4320",
              "id" : "[0-43]*4320",
              "uniq_id": "226-190306",
              "paths": {
               }
          }
      }
}

Two key insights:

dongahn commented 4 years ago

For an MPIR example, say that on a node (sierra23) the job shell generated its local MPIR proctable:

[
{
    "mpi_rank": 256,
    "pid": 23451,
    "hostname": "sierra23",
    "executable_name": "app"
},
{
    "mpi_rank": 257,
    "pid": 23452,
    "hostname": "sierra23",
    "executable_name": "app"
}
...
]

This can be condensed:

{
    "refs" : {
        "sierra23" : {
            "mpi_rank": 256,
            "pid": 23451,
            "hostname": "sierra23",
            "executable_name": "app"
       }
    },
    "proctabs": {
         "ref" : "sierra23",
         "deltas": {
             "mpi_rank": "256-300",
             "pid": "23451-23495",
         }
    }
}

When this is globally merged, we can apply the same transformation algorithm to the list of keys in "refs". One concern in this case, though, is that we would end up with a list of locally compressed lists under the proctabs key, whereas a better solution would be to perform further key-wide compression after the merge.

In any case, if the raw proctable is gathered and one-level reference delta encoding is used instead, you may get:

{
    "refs" : {
        "entry" : {
            "mpi_rank": 256,
            "pid": 23451,
            "hostname": "sierra23",
            "executable_name": "app"
       }
    },
    "proctabs": {
         "ref" : "entry",
         "deltas": {
             "mpi_rank": "0-484899",
             "pid": "23451-23495", "24994-484848..."
             "hostname": "sierra[23-39],sierra[40-41]..."
         }
    }
}

In this case, since we can't control whether the pid and hostname values are consistent with the primary sorting order ("mpi_rank"), the compression rate and the techniques we can use will be a bit more limited than in the JGF case.

dongahn commented 4 years ago

If we can make it so that the condensed delta entries can recursively be used as references themselves, maybe we can do better for the distributed case as well...

{
"lowest": 256,
"mpir": [
{
    "lowest": 256,
    "relative_rank": 0,
    "pid": 23451,
    "hostname": "sierra23",
    "executable_name": "app"
},
{
    "lowest": 256,
    "relative_rank": 1,
    "pid": 23452,
    "hostname": "sierra23",
    "executable_name": "app"
}
...
]
}
{
    "refs" : {
        "sierra23" : {
            "lowest": 256,
            "relative_rank": 0,
            "pid": 23451,
            "hostname": "sierra23",
            "executable_name": "app"
       }
    },
    "proctabs": {
         "ref" : "sierra23",
         "deltas": {
             "relative_rank": "0-44",
             "pid": "23451-23495”,
         }
    }
}

There would be many nodes whose relative_rank will be:

 "relative_rank": "0-44",
dongahn commented 4 years ago

For common cases, PIDs can probably be condensed this way too:

"relative_pid": "0-44"

and we would only keep the list of the lowest PID on each node.

dongahn commented 4 years ago

If I were to manually compress a 32-node, fully populated MPIR proctable using all of the intuition above, it would look similar to the following. (Just trying to see if there is a general compression algorithm/mechanism that can approximate this.)

"hostname": "sierra[1-10],sierra[50-59],sierra[321-330],sierra2,sierra1004"
"lowest_mpi_rank": "0,44,88,..."
"relative_mpi_rank": "[0-43]x32"
"lowest_pid": "[24344]x44,[33445]x44,[98822]x44,…"
"relative_pid": "[0-43]x32"

Then, fields like hostname and lowest_mpi_rank can be further compressed. For example, lowest_mpi_rank becomes:

{ "base": 0, "operator": "+", "operand": 44, "max": 1408}

"lowest_pid" would only be the field that is not compressible. But its count is the number of nodes, not number of MPI processes.

grondo commented 4 years ago

In the example above, how do you map the multiple pids/ranks from each hostname?

Why is lowest_pid x44 and relative_pid and relative_mpi_rank x32?

I have to admit I'm not really following this scheme. If you were to "unpack" the Proctable for sierra1, how would you go about it?

grondo commented 4 years ago

BTW, we don't have "hostlist" support any more in Flux, we may want to think of another way to create compressed lists, e.g. a common hostname "prefix" and then a JSON "range list"?

{ "host": {
    "prefix": "sierra",
    "suffix_list": [ [1, 10], [50-59] ] 
}

or similar?

dongahn commented 4 years ago

In the example above, how do you map the multiple pids/ranks from each hostname?

Why is lowest_pid x44 and relative_pid and relative_mpi_rank x32?

Sorry, I should have been more step-by-step. Say, on one hostname, you have 44 MPI processes. The basic set of transformations I am thinking about is:

no reduction

{
    "hostname": "sierra1",
    "lowest_mpi_rank": 0,
    "relative_mpi_rank": 0,
    "lowest_pid": 24344,
     "relative_pid": 0,
},
{
    "hostname": "sierra1",
    "lowest_mpi_rank": 0,
    "relative_mpi_rank": 1,
    "lowest_pid": 24344,
     "relative_pid": 1,
},
...
{
    "hostname": "sierra1",
    "lowest_mpi_rank": 0,
    "relative_mpi_rank": 43,
    "lowest_pid": 24344,
     "relative_pid": 43,
}

Then,

node local reduction

{
    "hostname": "sierra[1]x44",
    "lowest_mpi_rank": "[0]x44",
    "relative_mpi_rank": "[0-43]",
    "lowest_pid": "[24344]x44",
     "relative_pid": "[0-43]"
}

global reduction

As part of the global reduction across 32 compute nodes, you would see 32 instances of [0-43] for both relative_mpi_rank and relative_pid, so I used [0-43]x32 to denote that reduction.

"hostname": "sierra[1-10],sierra[50-59],sierra[321-330],sierra2,sierra1004"
"lowest_mpi_rank": "0,44,88,..."
"relative_mpi_rank": "[0-43]x32"
"lowest_pid": "[24344]x44,[33445]x44,[98822]x44,…"
"relative_pid": "[0-43]x32"
dongahn commented 4 years ago

BTW, we don't have "hostlist" support any more in Flux, we may want to think of another way to create compressed lists, e.g. a common hostname "prefix" and then a JSON "range list"?

Sure. One thing we have to think about is that we will probably want MPI_COMM_WORLD rank as the primary sorting order, so we need ways to still compress out-of-order lists like host suffixes. It seems a "range list" will do this fine, though.

grondo commented 4 years ago

Thanks @dongahn! That is much more clear.

However, is the straightforward reduction just as good if not better for most cases?

node local

{
    "hostname": "sierra1",
    "ranks": "0-43",
    "pids": "24344-2387"
}   

global reduction

{ 
    "hostname": "sierra[1-10],sierra[50-59],sierra[321-330],sierra2,sierra1004",
    "ranks": "0-1407",
    "pids": "24344-2387,33445-33489,..."
}       

Of course, in reality we would need to replace these hostlist range strings with a range list, e.g. "pids": [ [ 24344, 24387 ], [ 33445, 33489 ], ...], and maybe that has an impact as well.

Maybe we could devise a "custom" json object format to concisely describe the relative delta format you have above. E.g. an array of possibly repeated ranges could be encoded something like [ [ start, end, repeat ], [ ... ] ], where end and repeat are optional. As a convenience, if an entry is not an array it can be taken to be the same as [ [ N, 0, 1 ] ], where an end value of 0 indicates end == start.

E.g. To encode [0-43]x32 you would use [ [ 0, 43, 32 ] ], and a list of numbers 0, 44, 88 would be encoded as simply [ 0, 44, 88 ].

Then you can use two of these arrays to encode your reference delta, e.g, the mpi ranks from above:

{ "lowest_mpi_rank": "0,44,88,...",
   "relative_mpi_rank": "[0-43]x32" }

could be encoded as

[ [ 0, 44, 88, ... ], 
  [ [ 0, 43, 32 ] ] 
]

Doing the same for the pids, your globally reduced proctable might look like:

{  "hosts": "sierra[1-10],sierra[50-59],sierra[321-330],sierra2,sierra1004",
    "ranks": [ [ 0, 44, 88, ... ],[ [ 0, 43, 32 ] ] ],
    "pids": [ [ [2344, 0, 32], [33445, 0 32], ...], [ 0, 43, 32 ] ]
}
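
For illustration, a decoder for this [ start, end, repeat ] convention might look roughly like the following (just a sketch of the semantics described above, not a settled format):

# Expand one encoded array:
#   - a bare number N stands for [ N, 0, 1 ] (i.e., N once)
#   - end == 0 (or omitted) means end == start
#   - repeat defaults to 1
def expand(encoded):
    out = []
    for entry in encoded:
        if not isinstance(entry, list):
            entry = [entry]
        start = entry[0]
        end = entry[1] if len(entry) > 1 and entry[1] != 0 else start
        repeat = entry[2] if len(entry) > 2 else 1
        out.extend(list(range(start, end + 1)) * repeat)
    return out

# expand([[0, 43, 32]]) -> 0..43 repeated 32 times (1408 ranks)
# expand([0, 44, 88])   -> [0, 44, 88]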
dongahn commented 4 years ago

However, is the straightforward reduction just as good if not better for most cases?

Nearly. I think you can get some more gain from PID reduction. And having [0-43]x32 would have a slight advantage if we want to further apply a general low-level compression such as gzip or xz.

Note, though, that this came from my initial attempt to fit this into my "conceptual" reference delta compression framework as a general tool.

dongahn commented 4 years ago

@grondo: Yeah, the rest of your posting is really aligned with what I have been thinking about!!

In a layman's notation:

[1-n] would be equal to [1-n]:1, meaning 1 through n with a stride of 1
[1-n]:m would mean 1 through n with a stride of m
[1-n]xk would be the range [1-n] repeated k times

And there could be other patterns that further help reduce the space needed for delta encoding. We can identify those patterns using this layman's notation first and then formalize our format using something like what you proposed.

Good thinking!

dongahn commented 3 years ago

We will see if some of the ideas discussed here can help in designing the compression schemes of RV2.