Open njtierney opened 3 years ago
Or https://github.com/reside-ic/ids, or do something like what reprex did (https://github.com/tidyverse/reprex/blob/d2996e01f045b04cd537653a39deece1025dbf35/R/aaa.R), but this might not be unique enough.
This is currently being worked on here: https://github.com/njtierney/greta/tree/unique-names
There is an issue where an error appears:
Error in distrib_constructor(tf_parameter_list, dag = self) : could not find function "distrib_constructor"
Which means it is not finding
https://github.com/greta-dev/greta/blob/112a96804170d7cccdb76b1f413cdbbb23f0738d/R/dag_class.R#L410
It makes me wonder if perhaps this is related to this issue. We have not been able to reliably develop a small reprex for this issue, so it might not be related to this one.
A note on using hashing like secretbase, which is what targets uses internally: as long as the nodes aren't identical this will work, but if two nodes/R6 objects are identical, their hashes will be identical too. So I guess the idea is that as long as the input isn't identical, it should be OK.
n_rhex <- 1e6
# generate a random 8-digit hexadecimal string
rhex <- function() paste(as.raw(sample.int(256L, 4, TRUE) - 1L), collapse = "")
many_rhex <- function(x) replicate(n = x, expr = rhex(), simplify = "vector")
rhexes <- many_rhex(n_rhex)
dplyr::n_distinct(rhexes)
#> [1] 999883
dplyr::n_distinct(rhexes) == n_rhex
#> [1] FALSE
many_siphash <- function(n) {
  vapply(
    X = seq_len(n),
    FUN = secretbase::siphash13,
    FUN.VALUE = ""
  )
}
many_siphashes <- many_siphash(n_rhex)
dplyr::n_distinct(many_siphashes)
#> [1] 1000000
dplyr::n_distinct(many_siphashes) == n_rhex
#> [1] TRUE
Created on 2024-05-28 with reprex v2.1.0
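The shortfall in distinct `rhex()` values above is roughly what the birthday problem predicts for 1e6 draws from the 2^32 possible 8-digit hex strings. A minimal sketch of that estimate, assuming the draws are uniform and independent (the variable names here are illustrative, not from greta):

```r
# Expected number of distinct values when drawing n IDs uniformly,
# with replacement, from a space of N possible IDs.
n <- 1e6   # number of IDs generated
N <- 2^32  # 4 random bytes = 8 hex digits

# N * (1 - (1 - 1/N)^n), written with log1p/expm1 for numerical stability
expected_distinct <- N * -expm1(n * log1p(-1 / N))
expected_duplicates <- n - expected_distinct

# Probability that at least one pair of IDs collides
p_any_collision <- -expm1(-n * (n - 1) / (2 * N))

expected_duplicates
p_any_collision
```

Under these assumptions the expected number of duplicates is about 116, in line with the 117 duplicates observed in the reprex (1e6 - 999883), and the probability of at least one collision is effectively 1.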
Other alternatives:
https://github.com/coolbutuseless/cryptorng, or {digest}?
Some ideas on debugging this.
greta_stash$object_counter <- 0L
# generate a unique name for each node.
rhex <- function() {
  count <- greta_stash$object_counter + 1L
  greta_stash$object_counter <- count
  count
  # paste(as.raw(sample.int(256L, 4, TRUE) - 1L), collapse = "")
}
So we get a sense of how many objects are created?
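The counter idea can be tried outside of greta. This is only a sketch: `id_stash` is a hypothetical stand-in for `greta_stash`, and `next_node_id()` and its `prefix` argument are made up for illustration, not part of greta.

```r
# Hypothetical stand-in for greta's internal stash environment
id_stash <- new.env(parent = emptyenv())
id_stash$object_counter <- 0L

# Return a name built from a running counter. Names are unique within a
# session because the counter only ever increases.
next_node_id <- function(prefix = "node") {
  count <- id_stash$object_counter + 1L
  id_stash$object_counter <- count
  sprintf("%s_%d", prefix, count)
}

ids <- vapply(seq_len(5), function(i) next_node_id(), FUN.VALUE = character(1))
ids
# the counter also records how many objects have been created so far
id_stash$object_counter
```

A counter cannot collide the way random draws can, but the names are only unique within one R session, and an integer counter overflows to `NA` once it passes `.Machine$integer.max`.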
This may be unlikely to happen, or to matter in practice, but the rhex() function as defined isn't guaranteed to create a unique name when there are very many nodes (say, 1 million).
This is used in node_class.R.
See example below.
Created on 2021-04-08 by the reprex package (v2.0.0)
Perhaps digest or something like https://github.com/coolbutuseless/xxhashlite could be used to give nodes unique IDs