filecoin-project / ref-fvm

Reference implementation of the Filecoin Virtual Machine
https://fvm.filecoin.io/
Other
383 stars 138 forks source link

Machine IDs and `store_artifact` aren't correct #670

Open mriise opened 2 years ago

mriise commented 2 years ago

The idea of Machine IDs were to have a somewhat random number that could identify a machine (epoch + some random number). Ideally we would get some randomness from the network to generate the random number (this was done originally) https://github.com/filecoin-project/ref-fvm/pull/616#discussion_r935063152. Since we can't get randomness from the current epoch, it was scrapped with https://github.com/filecoin-project/ref-fvm/pull/669 and now we use a thread local rand instead.

This ID is only used in the store_artifact syscall currently, so it is fine for now but if it has other uses it would need a different source of randomness or be removed entirely.

Removing ID would require where the artifacts are stored to be structured to prevent clobbering previous machines/actors. Currently the thought is to have the node set env var FVM_STORE_ARTIFACT_DIR to {anything}/{unix timestamp of starting the FVM}/ then have some atomic counter in lotus that passed into the FVM that it uses adds internally to the prefix, with the file name being whatever any actor decides to name it (overwritting). eg /foo/bar/1659483852/3/boxy.cov This would let actors overwrite artifacts from previous actors in the callstack which is (maybe?) bad. So that would need another solution to potentially prevent that (but this was done in the previous implementation as well).

A good Machine ID could replace timestamp and atomic counter, though actor clobbering would still need addressing.

Stebalien commented 2 years ago

Since we can't get randomness from the current epoch, it was scrapped with https://github.com/filecoin-project/ref-fvm/pull/669 and now we use a thread local rand instead.

So, we were trying to draw randomness from the tipset that included the messages being executed. That's actually the parent tipset, so, really, we should just use the previous epoch.

But a counter works just fine as well and may be more robust (in case we flip-flop?).

Stebalien commented 2 years ago

This ID is only used in the store_artifact syscall currently, so it is fine for now but if it has other uses it would need a different source of randomness or be removed entirely.

Why?

Removing ID would require where the artifacts are stored to be structured to prevent clobbering previous machines/actors. Currently the thought is to have the node set env var FVM_STORE_ARTIFACT_DIR to {anything}/{unix timestamp of starting the FVM}/

I would try to reduce the load on lotus here as much as possible.

mriise commented 2 years ago

That's actually the parent tipset, so, really, we should just use the previous epoch.

I am for doing this, as I would prefer to have Machine IDs reproduceable instead of something local to whatever node is running it (if I am understanding how this chain randomness works here, TODO learn to be sure)

This ID is only used in the store_artifact syscall currently, so it is fine for now but if it has other uses it would need a different source of randomness or be removed entirely.

Why?

I want Machine ID to be purely local to a node if it is using local specific (eg thread randum number) values that would seem to come out of thin air to other sources, so instead if Machine ID is shared externally (with the network or otherwise, which might have utility) then I would want the IDs to be derived off of something the network could inspect or at least make sense of.

I would try to reduce the load on lotus here as much as possible.

fair, though i would really like to keep getting a unix timestamp out of the FVM. (not great for portability if included on our side)

Stebalien commented 2 years ago

I want Machine ID to be purely local to a node if it is using local specific (eg thread randum number) values that would seem to come out of thin air to other sources, so instead if Machine ID is shared externally (with the network or otherwise, which might have utility) then I would want the IDs to be derived off of something the network could inspect or at least make sense of.

Ah, got it. That makes sense.

fair, though i would really like to keep getting a unix timestamp out of the FVM. (not great for portability if included on our side).

I'm still not sure where the need for timestamps come from.

mriise commented 2 years ago

I'm still not sure where the need for timestamps come from.

timestamps are helpful if you arent using any RNG and are purely depending on the file structure of store_artifact to prevent machine restarts from clobbering eachother's artifacts

karim-agha commented 2 years ago

I think that having a generic file storage for a blockchain is an awfully bad idea. that's not the way to solve this problem. Blockchains are a very controlled environment and turning them into a dropbox is not the way to go to solve data that you still don't know how to expose properly.

mriise commented 2 years ago

think that having a generic file storage for a blockchain is an awfully bad idea.

This is not the intention store_artifact if that is what you are talking about. Actors need to be able to serialize data to a file to produce code coverage reports. This is only enabled for debugging https://github.com/mriise/ref-fvm/blob/8d1cd10b1f98b9098d9b04795e2e93ca6359e320/fvm/src/syscalls/debug.rs#L33 which shouldn't be enabled on filecoin miners (except maybe in devnets?).

As for MachineId I think it would be interesting to have it network derived, but I don't know of a usecase for this so it is just used locally for now.

maciejwitowski commented 2 years ago

@mriise I think this is M2.2, right?