filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/
Other
2.83k stars 1.26k forks source link

Remove temporary piece file generation from go-fil-markets for OFFLINE deals #4901

Open hannahhoward opened 3 years ago

hannahhoward commented 3 years ago

This is just a place holder for figuring out, once we complete https://github.com/filecoin-project/lotus/issues/4898, a way to not use temporary files for offline deals as well as online deals. I'm not sure the best approach to do this.

dirkmc commented 3 years ago

Currently ImportDataForDeal

Then in the HandoffDeal a Reader is created from the file.

Instead we could change ImportDataForDeal to

From then on handle the deal the same way as if the data had come across the network.

@hannahhoward is it easy to delete the data for a deal if verification fails?

hannahhoward commented 3 years ago

@dirkmc what does "import the deal data directly from the Reader into the deal store" -- do you mean a block store here? the problem here is we have a reader, which is literally just bytes. It's likely a CAR file, but NOT gauranteed to be generated in a way we can replicate in Lotus. We cannot attempt to reproduce a more semantic version of the data here.

My believe is there is no way to avoid a temporary file as long as we use a reader (we can certainly use an OS pipe + TeeReader, to unlock two seperate reads, but then you're just delegating the temp file to the OS). We could certainly change it to a file path. That has the unfortunate down side of leaving us to trust that the person running import does not delete the file in between running the command and getting to the HandoffDeal state.

It is worth mentioning @mikael who can provide more context on why we can't attempt to regenerate the CAR file, and @whyrusleeping who may have insights as the person who wrote the code originally.

Also though, at the end of the day this just may not be a high priority item given it's difficulty.

BTW, we use a thing called a multistore to isolate blocks for a deal from other deals, which means that yes, we can and do delete deal data when verification fails.

ribasushi commented 3 years ago

@hannahhoward you highlighted the wrong @mikeal

On commp calculation - you do not need to create the entire .car on disk, which I believe @magik6k did point out earlier on slack. As long as you have the entire dag in a blockstore already, you simply stream out the car in the usual left-first depth-first seen-first order, and directly calculate commp over the raw .car bytes that stream out of that walk.

jennijuju commented 3 years ago

Seems to be related to https://github.com/filecoin-project/lotus/issues/5291

ribasushi commented 3 years ago

@raulk @dirkmc is this addressed by the current deal-data flow rework?

frrist commented 3 years ago

Did some exploration around this issue as a part of related work. As Hannah has stated above, temporary files are required unless we are comfortable trusting that a user will not modify/delete the file they imported until it has been sealed into a sector. This is due to file importing and sector sealing happening asynchronously -- importing the file is near instance, sealing into a sector takes significantly longer.

Until a solution is decided on the workaround I intended to implement is:

dirkmc commented 3 years ago

@raulk @aarshkshah1992 does the solution outlined by @frrist make sense to you guys or is this something that's going to be covered by the DAG store work?