ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

UrlStore documentation of cache behavior (when combining with MFS) #6928

Open RubenKelevra opened 4 years ago

RubenKelevra commented 4 years ago

Hey guys, there's currently very little documentation available about the experimental "UrlStore" feature.

As I understand it, no data is cached locally when a file is added to IPFS this way; instead, a link points to a URL from which the data is fetched on demand.

I'd like to combine this feature with MFS so I can easily alter the file/folder structure while the HTTP server still holds the data.

The response time of an ipfs files cp on a CID stored with UrlStore suggests that MFS lazily links the new CID into the local MFS rather than pulling the content into the local cache.

The question is whether the --nocopy flag is a general 'no local caching' flag, or whether my local cache will start caching the files once they are requested from IPFS, because I've added them to the MFS. Since including the CID in the MFS prevents garbage collection from removing those files from the cache, that would mean cache usage keeps rising until my local storage is used up (the HTTP server holds about two orders of magnitude more data than I can store locally).

Best regards

Ruben

Stebalien commented 4 years ago

> I'd like to combine this feature with MFS so I can easily alter the file/folder structure while the HTTP server still holds the data.

Note: you'll still need to download the data once (throwing it away immediately) to chunk/hash it.

> The question is whether the --nocopy flag is a general 'no local caching' flag, or whether my local cache will start caching the files once they are requested from IPFS, because I've added them to the MFS.

Under the covers, we create "fake" blocks in a "fake" datastore that refer to slices of files that can be found at known URLs.

We won't ever end up caching those blocks locally because, from IPFS's perspective, we already have them.
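To make the "fake blocks" idea concrete, here is a minimal toy model in Python. It is not kubo's code: `UrlRef`, `UrlRefStore`, `fake_fetch`, the tiny `CHUNK_SIZE`, and the `example.invalid` URL are all hypothetical names chosen for illustration, and a plain SHA-256 hex digest stands in for a real CID. The sketch only shows a store that keeps (URL, offset, size) references instead of block bytes and reads the slice back on demand.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List

CHUNK_SIZE = 4  # tiny on purpose; real chunkers use far larger blocks


@dataclass(frozen=True)
class UrlRef:
    """Where a block's bytes live: a slice of a remote file."""
    url: str
    offset: int
    size: int


def block_id(data: bytes) -> str:
    # Stand-in for a CID (assumption): a plain SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()


class UrlRefStore:
    """Toy 'fake datastore': holds references, never the block bytes."""

    def __init__(self, fetch_range: Callable[[str, int, int], bytes]):
        self._fetch_range = fetch_range  # hypothetical range reader
        self.refs: Dict[str, UrlRef] = {}

    def add(self, url: str, data: bytes) -> List[str]:
        # The data is read once to chunk and hash it, then thrown away.
        ids = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            bid = block_id(chunk)
            self.refs[bid] = UrlRef(url, offset, len(chunk))
            ids.append(bid)
        return ids

    def has(self, bid: str) -> bool:
        # From the node's perspective the block is "already there".
        return bid in self.refs

    def get(self, bid: str) -> bytes:
        ref = self.refs[bid]
        data = self._fetch_range(ref.url, ref.offset, ref.size)
        if block_id(data) != bid:
            raise ValueError("remote content changed; block no longer matches")
        return data


# Simulated remote server so the sketch runs without network access.
URL = "http://example.invalid/file.bin"
REMOTE = {URL: b"hello world!"}


def fake_fetch(url: str, offset: int, size: int) -> bytes:
    return REMOTE[url][offset:offset + size]


store = UrlRefStore(fake_fetch)
ids = store.add(URL, REMOTE[URL])
content = b"".join(store.get(b) for b in ids)
```

In this model `has()` answers true for every referenced block even though no bytes are stored, which is the sense in which there is nothing left to cache locally.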


Are you interested in writing this up in the "docs/experimental-features.md" file?

RubenKelevra commented 4 years ago

> > I'd like to combine this feature with MFS so I can easily alter the file/folder structure while the HTTP server still holds the data.

> Note: you'll still need to download the data once (throwing it away immediately) to chunk/hash it.

Sure :)

> > The question is whether the --nocopy flag is a general 'no local caching' flag, or whether my local cache will start caching the files once they are requested from IPFS, because I've added them to the MFS.

> Under the covers, we create "fake" blocks in a "fake" datastore that refer to slices of files that can be found at known URLs.

> We won't ever end up caching those blocks locally because, from IPFS's perspective, we already have them.

Thanks!

> Are you interested in writing this up in the "docs/experimental-features.md" file?

Yes, feel free to assign me :)

RubenKelevra commented 4 years ago

@Stebalien

Question:

What happens if I run something like:

ipfs add --nocopy URL

and some parts of the file match a file already stored? Will the blocks be dropped, and fetched on demand from the URL?

Stebalien commented 4 years ago

> and some parts of the file match a file already stored? Will the blocks be dropped, and fetched on demand from the URL?

They'll stay and we won't fetch them from the remote server.

Also, I wasn't quite correct when I said:

> We won't ever end up caching those blocks locally because, from IPFS's perspective, we already have them.

If you call ipfs add with neither --nocopy nor --fscache (the latter means: check the filestore before adding), we'll add the blocks to the local blockstore.
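The two answers above (blocks that are already stored stay put, and a plain add copies blocks into the local blockstore) can be sketched as a toy model. Everything here is hypothetical and is not kubo's implementation: `ToyNode`, its `local` and `url_refs` dictionaries, the tiny `CHUNK_SIZE`, and the `example.invalid` URL are illustration-only names, and a SHA-256 hex digest stands in for a CID. The sketch also assumes that no URL reference is recorded for a block that is already stored locally, which the thread does not confirm.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; real chunkers use far larger blocks


def block_id(data: bytes) -> str:
    # Stand-in for a CID (assumption): a plain SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()


def chunks(data: bytes):
    for offset in range(0, len(data), CHUNK_SIZE):
        yield offset, data[offset:offset + CHUNK_SIZE]


class ToyNode:
    def __init__(self):
        self.local = {}     # block id -> bytes (local blockstore)
        self.url_refs = {}  # block id -> (url, offset, size)

    def add(self, data: bytes, url: str = None, nocopy: bool = False):
        if nocopy and url is None:
            raise ValueError("nocopy needs a URL to point at")
        for offset, chunk in chunks(data):
            bid = block_id(chunk)
            if bid in self.local:
                # Already stored locally: it stays, nothing is re-fetched.
                continue
            if nocopy:
                # Only remember where the bytes can be found.
                self.url_refs[bid] = (url, offset, len(chunk))
            else:
                # Plain add: the bytes land in the local blockstore.
                self.local[bid] = chunk


node = ToyNode()
node.add(b"aaaabbbb")  # plain add: both blocks copied locally
node.add(b"bbbbcccc", url="http://example.invalid/f", nocopy=True)
refs_after_nocopy = dict(node.url_refs)  # only "cccc" is referenced by URL
node.add(b"cccc")      # plain add of URL-backed data: now copied locally
```

After the second call, the shared block `bbbb` is still served from the local blockstore and only `cccc` points at the URL; the third call shows how a later plain add makes local storage grow again.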

RubenKelevra commented 4 years ago

> > and some parts of the file match a file already stored? Will the blocks be dropped, and fetched on demand from the URL?

> They'll stay and we won't fetch them from the remote server.

> Also, I wasn't quite correct when I said:

> > We won't ever end up caching those blocks locally because, from IPFS's perspective, we already have them.

> If you call ipfs add with neither --nocopy nor --fscache (the latter means: check the filestore before adding), we'll add the blocks to the local blockstore.

Thanks! Very interesting!

RubenKelevra commented 4 years ago

@Stebalien what happens if I add a file twice, both times with --nocopy and different URLs?

Will the URL be saved as two locations, or will the old URL be overwritten?

Stebalien commented 4 years ago

I believe the last-used URL will win, but you'd have to check.