ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

nocopy option doesnt work without moving files (defeating part of the purpose) #4224

Open PCSmith opened 6 years ago

PCSmith commented 6 years ago

Version information: 0.4.10

Type: Bug / Implementation Flaw

Severity: high

Description:

The "filestore" capability should be considered baseline for this project IMO. The expectation can't be to duplicate all bytes you want to place on IPFS by keeping a copy in the datastore... Nor can it be to expect users to throw their existing directory organisational structure out the window to manually copy IPFS candidate files to a central location.

Need the ability to --nocopy any file in any location and have that file not move or be copied anywhere. Also, if this --raw-leaves thing is needed for this it should be done by default. Not sure if it is though.

Thanks for your work on this world changing project.

whyrusleeping commented 6 years ago

@PCSmith ipfs can only add files within its directory context to the filestore as a security measure. Think of it like a git repository. If I remember correctly, symlinks work fine, so you can symlink your /datamount into your homedir and use the filestore from there.
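On Linux or macOS, the symlink workaround described above might look roughly like the sketch below. `link_into_root` is a hypothetical helper, `/datamount` and the home directory are the example locations from the comment, and whether symlinked paths are accepted may depend on the go-ipfs version and platform.

```shell
#!/bin/sh
# link_into_root SRC LINK
# Hypothetical helper: create a symlink LINK pointing at the directory SRC,
# unless something already exists at LINK.
# LINK is meant to live under the directory that contains the ipfs repo
# (by default $HOME, since the repo is ~/.ipfs).
link_into_root() {
    src=$1
    link=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already exists, leaving untouched: $link" >&2
        return 0
    fi
    ln -s "$src" "$link"
}

# Example usage (not executed here):
#   link_into_root /datamount "$HOME/datamount"
#   ipfs add --recursive --nocopy "$HOME/datamount"
```

The filestore itself has to be enabled first (`ipfs config --bool Experimental.FilestoreEnabled true`, as used later in this thread).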

kevina commented 6 years ago

Note that the .ipfs default location is the home directory. This exposes the entire home directory, which is likely to contain a lot of sensitive information. Thus, at least with the current setup, I do not see this as a very convincing argument. Apologies if I am missing something obvious.

PCSmith commented 6 years ago

A few thoughts: I'm not on Linux right now; as with the majority of your likely intended audience for this project, I'm on Windows. The "home" directory isn't where most data is kept. I keep nothing in there in fact, and don't care to, as that's on my SSD.

My use case is making several hundred videos from my youtube channel accessible via IPFS. The video files are dispersed through a file system meant to organize them with their other assets (clips, adobe premiere files, audio, etc). Copying them would be a ridiculous waste of space. Moving them all away from their support structure and accompanying assets would be more than inconvenient.

I do not argue the security concerns and I'm glad you guys are keeping an eye on it. Though exceptions accepted through positive actions should be possible for usability. And please don't put Windows last on your list of considerations. I'd argue it should be near the front for adoption's sake, not because I don't use and love Linux.

If symlinking my video tree into the ipfs path will work for this I will def give it a shot. Thanks guys!

PCSmith commented 6 years ago

Before I spend the time deleting my store and recreating everything let me make sure I understand how this will work on Windows.

The IPFS executable exists in a folder on drive X. Let's say X:\IPFS\IPFS.exe, which has been added to my paths. The data store, through the IPFS_PATH environment variable, exists in X:\IPFS\Datastore. If I symlink my youtube video tree in at X:\IPFS\YouTubes, then filestore can work with those files without copying or moving them?

whyrusleeping commented 6 years ago

@PCSmith Hey, thanks for the feedback. Getting feedback and people pushing for better windows support definitely helps us prioritize things.

The symlink as you describe it should work. Let me know if you run into any issues, I have a windows VM around now and should be able to help debug.

PCSmith commented 6 years ago

Is raw-leaves required?

djdv commented 6 years ago

Random anecdote, I've added over a terabyte of data across many different files and folders via nocopy, utilising symlinks in the IPFS_PATH without issues on Windows (outside of issues relating to filestore commands that aren't implemented yet). I made a shell extension for Windows that makes a symlink of the target's parent folder and this is the primary method I use for now.


2021 edit: Key formats seem to have changed and I still get questions about this project. The last published version is at /ipfs/zDMZof1m1fX98cTLyC2VLe9iDQQhWgDLu5foshBSsxSWHQNuiyYV and the IPNS key is now /ipns/k51qzi5uqu5di8iluwqo958r5wf6vw7imzfww3zg1gi7br27ze7h3k93ddisr8 (it's the same keyfile after I imported it with ipfs key import from the old /ipns/QmaUgENG66kp6cyYUoiKREJWRaaQZmFt7EfFEnoMN1UvJZ key)

I haven't maintained this, but if you replace the bundled ipfs.exe it will probably still work. If you find it useful and want me to update it, reach out to me and I'll try to fix up the code so that the binary works and can release the code with it. Or consider IPFS Desktop.

whyrusleeping commented 6 years ago

@PCSmith yes, raw leaves is required. In the near future we will be defaulting that option to true for normal adds as well.

PCSmith commented 6 years ago

crap -- I added my entire library without raw-leaves. Do I need to clean and redo? What happens without it?

whyrusleeping commented 6 years ago

If you're using the --nocopy option, it turns --raw-leaves on automatically for you.

PCSmith commented 6 years ago

oh! perfect. thanks. You can close this out I suppose. But I think my criticism still stands if your target audience is your typical computer user. Symlinks and environment variables are probably not going to work for them. If that's not the audience then disregard. I'm just having trouble figuring out how this is going to go mainstream without laymen being able to easily seed / pin things / publish things.

The example is Dtube -- most of these people are uploading videos without realizing that unless their vids pay that dude enough to keep their files pinned on his hosting platform, their videos are not going to live long unless they're running their own node and pinning everything themselves. Because the browser version of ipfs can't exactly pin most videos (50 mb limit right?), and even if it could it would only be running while that page was open. I guess we're OT at this point. Just rambling.

Is there someone available for the project to be interviewed on my channel? I'd love to promote the project.

PCSmith commented 6 years ago

btw djdv -- it is so freaking awesome that you were able to deploy that site with video and download to IPFS. Loving the possibilities here.

whyrusleeping commented 6 years ago

But I think my criticism still stands if your target audience is your typical computer user

Well, the typical computer user won't be using the command line either. When we have a nicer user interface for ipfs, this sort of thing could be more easily automated and hidden away from the user.

and even if it could it would only be running while that page was open

Not true actually, using an ipfs service worker, you could have a js-ipfs node running in the background being used by any website that needs it.

Is there someone available for the project to be interviewed on my channel?

I would be interested, but I'm going to be traveling for a few weeks so it might be difficult.

PCSmith commented 6 years ago

When we have a nicer user interface for ipfs, this sort of thing could be more easily automated and hidden away from the user.

Is anyone working on that or is that an area I might contribute?

and even if it could it would only be running while that page was open

oh yeah! I forgot about those.

I would be interested, but I'm going to be traveling for a few weeks so it might be difficult.

Awesome. Let me know how or who I can get in touch with my producer to set it up?

whyrusleeping commented 6 years ago

Is anyone working on that or is that an area I might contribute?

there are a lot of different projects, but nothing that nice yet. I think the biggest issue is nobody really knows what is needed or wanted. Who are the users? What are the use cases? how do we best support that, etc.

Let me know how or who I can get in touch with my producer to set it up?

Grab my email from a commit (spam avoidance)

kevina commented 6 years ago

@PCSmith can I have a link to your channel? Your GitHub profile doesn't say much about you.

PCSmith commented 6 years ago

I emailed you a while back why...

Channel is here: https://steemit.com/@disenthrall https://dtube.video/#!/c/disenthrall https://www.youtube.com/DisenthrallMe https://www.facebook.com/Disenthrall/

whyrusleeping commented 6 years ago

Hey, I got your email. Just been moving around a lot without stable internet since then

On Sun, Oct 8, 2017, 7:17 AM Patrick notifications@github.com wrote:

I emailed you a while back why...

Channel is here: https://steemit.com/@disenthrall https://dtube.video/#!/c/disenthrall https://www.youtube.com/DisenthrallMe https://www.facebook.com/Disenthrall/


PCSmith commented 6 years ago

no worries. I look forward to the talk. :)

2 side questions:

  1. How do I update a nocopy file in ipfs when I move it?

  2. Is there a way to search IPFS to see if a file has been nocopy pinned?

kevina commented 6 years ago

  1. How do I update a nocopy file in ipfs when I move it?

You can't right now; please see https://github.com/ipfs/go-ipfs/issues/4260

  2. Is there a way to search IPFS to see if a file has been nocopy pinned?

The best you can do now is ipfs filestore ls to get the contents of the filestore; this will only list the leaves, not the pinned roots.
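To check whether one particular file is referenced by the filestore, the output of ipfs filestore ls can be filtered by path. A small sketch, assuming each output line has the form `<hash> <size> <path> <offset>` and that paths contain no whitespace; `filestore_has_path` is a hypothetical helper, not an ipfs command.

```shell
#!/bin/sh
# filestore_has_path PATH
# Reads `ipfs filestore ls` output on stdin and succeeds if at least one
# block references exactly PATH (assumed to be the third field).
filestore_has_path() {
    WANT=$1 awk '$3 == ENVIRON["WANT"] { found = 1 } END { exit !found }'
}

# Example usage (not executed here; the path is illustrative):
#   ipfs filestore ls | filestore_has_path videos/episode1.mp4 && echo "in filestore"
```

The path has to be written the way the filestore recorded it, which may be relative to the ipfs root rather than absolute.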

Voker57 commented 6 years ago

@whyrusleeping could you please explain the security importance of root restriction, from which attack vector does it protect?

renich commented 6 years ago

I am typing randomly here but, I think this is solvable in a not-so-convoluted way by:

# create systemd service unit
cat << 'EOF' > /etc/systemd/system/ipfs@.service
[Unit]
Description=InterPlanetary File System
After=network.target

[Service]
ExecStart=/usr/local/bin/ipfs daemon --enable-gc --migrate
ExecStop=/usr/local/bin/ipfs shutdown
Group=%i
Restart=always
Type=simple
User=%i

[Install]
WantedBy=multi-user.target
EOF

# create a user for this purpose
useradd --create-home --home-dir=/var/lib/ipfs/ --system --shell=/bin/bash ipfs

# login as that user
su - ipfs

# init
ipfs init

# create relevant dirs
## mounts
mkdir -m 2770 mounts
mkdir -m 2770 mounts/{foo,bar}

## ipfs and ipns
mkdir -m 2770 ipfs ipns

# configure ipfs
## enable filestore
ipfs config --bool Experimental.FilestoreEnabled true

## set ipfs and ipns mount points
ipfs config Mounts.IPFS $( pwd -P )/ipfs
ipfs config Mounts.IPNS $( pwd -P )/ipns

# exit ipfs user
exit

# go back to the user
su - ipfs

# check peers
ipfs swarm peers

# mount whatever directories you want
## bind existing directories. You could, also, add the entry at /etc/fstab:
## /home/renich/foo /var/lib/ipfs/mounts/foo none bind
## note: remember that the directory has to be readable by the ipfs user now. It's entirely up to you how you do this. I can think of: 
## * common group between users and ipfs
## * ACLs
## * bindfs UID and GID mapping
## * add the ipfs user to the user's group (not recommended but pretty much how it currently works)
mount -o bind /home/renich/foo ~ipfs/mounts/foo

## mount a drive
## you could, also, add it to /etc/fstab
## /dev/sdXi /var/lib/ipfs/mounts/bar btrfs defaults
mount /dev/sdXi ~ipfs/mounts/bar

# exit ipfs user
exit

# start the daemon
systemctl start ipfs@ipfs.service

# back to ipfs user
su - ipfs

# add stuff
ipfs add --progress --recursive --nocopy $HOME/mounts/foo
ipfs add --progress --recursive --nocopy $HOME/mounts/bar

I mean, it's not as easy as curl some-script | bash but it works, more or less.

ghost commented 6 years ago

Hey @Renich please don't use IPFS for anything that could be regarded as copyright infringement. I've edited your comment slightly. For more info see https://github.com/ipfs/community/blob/master/code-of-conduct.md

Thanks for the script though, very nice.

renich commented 6 years ago

@lgierth sure thing. Just joking a bit. ;) Won't happen again.

renich commented 6 years ago

@lgierth btw, you missed a few. Updated again.

PCSmith commented 6 years ago

You are attempting to police content available on IPFS now? interesting... noted.

-Patrick Intellectual property is not a valid form of property.

whyrusleeping commented 6 years ago

Nope, not policing content on ipfs. Just the community forums that we spend our time maintaining for the sake of the community.

iain17 commented 6 years ago

I'm wanting to do something similar. I have a use case where every peer in my IPFS network shares a certain directory with files up to 10gb or more. Where this directory is located differs per user. I'm a little confused as to what --nocopy actually does now.... How can I add these files to IPFS without duplicating each file to the ~/.ipfs/ home directory?

The app will mostly be used on Windows so I doubt the symlink solutions discussed in this issue are available right?

Perhaps a possible solution would be to add an option to add hashes of the files to the network, telling the network that this peer has the files. Then when a peer asks for a hash which IPFS can't find in its own filestore, we have a callback to return an io.Reader of different locations where it could... Giving the end user some flexibility.

Voker57 commented 6 years ago

@iain17 if users store data in their $HOME, they can use --nocopy, if not, they can add a symlink to their $HOME.

I'm still mystified on how that restriction improves security, however.

iain17 commented 6 years ago

@Voker57 ah thanks for clearing that up. Same here. Do you know by any chance where in the code base this check for whether it's inside the home directory is done?

Voker57 commented 6 years ago

https://github.com/ipfs/go-ipfs/blob/master/filestore/fsrefstore.go#L213 looks like that's it
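As far as this thread describes it, the check compares the file's path against the ipfs root, which appears to be the directory containing the repo (the home directory for a default ~/.ipfs). The following is a rough shell approximation, not the actual Go code: `under_ipfs_root` is a hypothetical helper, and the real check may differ in how it treats symlinks or compares prefixes.

```shell
#!/bin/sh
# under_ipfs_root FILE REPO_DIR
# Succeeds if FILE lies below the parent directory of REPO_DIR.
# Assumption: the "ipfs root" is the directory that contains the repo.
# Paths are made absolute, but symlinks are deliberately not resolved,
# so a directory symlinked into the root counts as inside it.
under_ipfs_root() {
    file=$1
    repo=$2
    file_dir=$(cd "$(dirname "$file")" 2>/dev/null && pwd -L) || return 2
    root=$(cd "$(dirname "$repo")" 2>/dev/null && pwd -L) || return 2
    # With / as the root, every absolute path is inside it.
    [ "$root" = / ] && return 0
    case "$file_dir/" in
        "$root"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (not executed here):
#   under_ipfs_root "$HOME/videos/a.mp4" "$HOME/.ipfs" && echo "ok for --nocopy"
```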

Kubuxu commented 6 years ago

It is to limit a possible escalation attack. A user might have different permissions than the daemon and escalate the access through the API.

obo20 commented 5 years ago

Adding myself to the list of people who would love this type of capability.

I don't have a solution, but I do have a use case that's essentially the same as @iain17 and it would be great if we could somehow allow the ipfs daemon to share files that aren't directly in the root IPFS repo.

kevina commented 5 years ago

I have not looked at this super closely, however as we send the file contents anyway we can minimize the attack vector given that we (1) Either use the contents sent and not the actual file to generate hash or verify that the file contents match the underlying file, (2) always verify the contents of filestore blocks. I know we do (2) but not sure about (1).

Implementing these 2 checks will make it impossible to gain access to something you don't already have access to. There is a slight chance of a denial of service type attack vector, especially if the API is bound to a non-loopback interface, but I am not sure how this could be exploited.
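Check (2), verifying filestore blocks against the file on disk, can be illustrated with a small sketch. It assumes raw-leaf blocks whose digest is the SHA-256 of the referenced byte range, and it relies on the `sha256sum` utility being installed; `verify_range` is a hypothetical helper and not how go-ipfs implements the check.

```shell
#!/bin/sh
# verify_range FILE OFFSET SIZE EXPECTED_SHA256_HEX
# Succeeds only if the SIZE bytes starting at OFFSET in FILE still hash
# to the expected digest, i.e. the file has not changed underneath the
# reference. bs=1 keeps the byte arithmetic simple at the cost of speed.
verify_range() {
    file=$1
    offset=$2
    size=$3
    expected=$4
    actual=$(dd if="$file" bs=1 skip="$offset" count="$size" 2>/dev/null |
        sha256sum | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

If the referenced range no longer matches, the block cannot be served, which is why moving or editing a file added with --nocopy breaks it.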

mitra42 commented 5 years ago

I like @kevina 's proposal, because if the user has access to the file, they could just as easily do ipfs add /foo/bar without --nocopy.

I have a similar situation: I have (multiple) external hard disks that hold a mirror of some portion of the Internet Archive, and want to add their contents with --nocopy, but the .ipfs is in my home directory. I could programmatically add the symlinks, but this is going to a) litter the user's home directory with symlinks which are going to look odd in their UI, and b) more importantly, it's going to be hard to do since the locations of those disks are discovered by the server that is adding the files.

dzmitry-lahoda commented 5 years ago

I cannot make it work in Windows 10 with go-ipfs and mklink... So tried with no link, just file. Does not work either.... :((((

  1. download and unarchive go-ipfs
  2. open console and initialize ipfs in data directory.
  3. run and stop daemon.
  4. open `data/config` in notepad.
  5. Replace "FilestoreEnabled": false with "FilestoreEnabled": true
  6. start daemon again.
  7. put file b.mp4 into same directory as ipfs.exe.

    D:\p2p\go-ipfs>dir
     Volume in drive D is work
     Volume Serial Number is 5818-8AF1

     Directory of D:\p2p\go-ipfs

    03/30/2019  08:06 PM    <DIR>          .
    03/30/2019  08:06 PM    <DIR>          ..
    03/12/2018  03:33 PM        22,239,458 b.mp4
    11/02/2018  04:47 AM                 0 build-log
    03/30/2019  08:06 PM    <DIR>          data
    11/02/2018  04:47 AM               860 install.sh
    11/02/2018  04:47 AM        34,460,160 ipfs.exe
    11/02/2018  04:47 AM             1,083 LICENSE
    11/02/2018  04:47 AM               467 README.md
                   6 File(s)     56,702,028 bytes
                   3 Dir(s)  323,436,937,216 bytes free

  8. try to add file into ipfs.
  9. put file into `data/b.mp4`
  10. does not work

    D:\p2p\go-ipfs>ipfs add --nocopy b.mp4
     8.50 MiB / 21.21 MiB [==================================>----------------------------------------------------] 40.08%
    Error: cannot add filestore references outside ipfs root (.)

    D:\p2p\go-ipfs>ipfs add --nocopy data/b.mp4
     8.50 MiB / 21.21 MiB [==================================>----------------------------------------------------] 40.08%
    Error: cannot add filestore references outside ipfs root (.)

  11. Add file without `--nocopy`: all works, file is copied.

Links. Symlinks and junctions are silently ignored by ipfs.exe. Hardlinks behave the same as normal files (and I cannot link from another drive, so I will have to map somehow).

### So IPFS is unusable on Windows 10 for large file collections.

I have my file organization for local files; qbittorrent allows my file organization for remote files, git with lfs allows my file organization, syncthing allows me my file organization. But not the new tools I could use for the new web: neither ipfs nor zeronet.

I understand, kind of, that ipfs may improve storage of files which are versioned inside it, but that is not my case.

Stebalien commented 5 years ago

Either use the contents sent and not the actual file to generate hash or verify that the file contents match the underlying file, (2) always verify the contents of filestore blocks. I know we do (2) but not sure about (1).

We actually do both of these things.

The only remaining attack vector is confirmation: verifying that some file exists at some path. Unfortunately, that's easily exploitable with guess and check: guess the first byte, guess the second byte, etc.


@dzmitry-lahoda your files need to go in the same directory as your ipfs repo. That usually means they need to live somewhere in %USERPROFILE%.

pedroapero commented 5 years ago

I moved my repository path to /media for now since all my storage is mounted there. Seems to work so far. You still can't use soft links with it, so keep in mind all is hard-linked from /media/.

sudo mkdir /media/ipfs/
sudo chown user: /media/ipfs/
export IPFS_PATH=/media/ipfs/
ipfs init
ipfs add --nocopy /media/drive1/file1.bin
ipfs add --nocopy /media/drive2/file2.bin

One could also do this from / to allow storing system files. I don't know how mounting works on Windows though…

lordfenixnc commented 5 years ago

I'm a little confused... I am currently adding files with --nocopy on drive x and my ipfs is on drive e. Seems to be working fine, but if I open a 2nd CMD I get cannot add filestore references outside of ipfs root

jbarthelmes commented 4 years ago

Relevant discussion: https://github.com/ipfs/go-filestore/pull/25

@jbarthelmes:

@Stebalien: If an attacker could access the go-ipfs API, it could use the filestore to read an arbitrary file from the disk.

I still don't see the problem that the deleted 3 lines of code solve.

access the go-ipfs API

Which API do you mean? I'm thinking of these scenarios:

  1. Local privilege escalation: attacker gains same read access as ipfs daemon user. Solution: restrict that user's rights. Access to $HOME is already enforced by the OS.
  2. Remote attacker uses HTTP API to read from file system. Solution: Fix your network security.

Either way, restricting access to $HOME by default does not protect sensitive files like .ssh/id_* or .ipfs/config.

Stebalien commented 4 years ago

Unfortunately, someone could easily misconfigure their go-ipfs node and expose their IPFS API to other origins (e.g., allow the "*" origin). We plan on switching from origin to token/oauth-like security but until then, we're trying to be careful.

Either way, restricting access to $HOME by default does not protect sensitive files like

I agree. That's why this feature is still experimental and off by default. We're not removing it because there are legitimate use-cases and users who rely on it, but I don't want to make the situation any worse.

jbarthelmes commented 4 years ago

I made a quick fork at jbarthelmes/go-ipfs if you're bothered by this issue. It seems to just work with reading/writing absolute paths.

@mrambossek updated just for you

Stebalien commented 4 years ago

Note: If you like the filestore feature and want to make it better, I'd recommend implementing a stand-alone program to serve directories over IPFS. This program would:

  1. Monitor for changes within some target directory.
  2. Add & remove files from an "ipfs" directory when they're added/removed/changed in the target directory.
  3. Publish the root CID using IPNS and/or DNSLink.

This would sidestep the biggest issue with the filestore: if files referenced by the filestore are deleted or modified, go-ipfs will consider the datastore corrupt because a block it thinks should exist is missing.
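A stand-alone program along those lines could start out as a simple polling loop. The sketch below is only one possible shape: `dir_fingerprint`, `republish` and `watch_dir` are hypothetical helpers, polling stands in for real change notification, and it re-adds the whole target directory with --nocopy instead of maintaining a separate "ipfs" directory. Stale filestore references left behind by changed files are not cleaned up here.

```shell
#!/bin/sh
# dir_fingerprint DIR
# Prints a checksum over the names and contents of all files below DIR.
# Any added, removed, or modified file changes the fingerprint.
dir_fingerprint() {
    (cd "$1" && find . -type f -exec cksum {} + | sort | cksum)
}

# republish DIR
# Re-adds DIR through the filestore and publishes the new root CID via IPNS.
republish() {
    cid=$(ipfs add --recursive --nocopy --quieter "$1") || return 1
    ipfs name publish "/ipfs/$cid"
}

# watch_dir DIR [INTERVAL_SECONDS]
# Polls DIR and republishes whenever its fingerprint changes.
watch_dir() {
    last=
    while :; do
        now=$(dir_fingerprint "$1") || return 1
        if [ "$now" != "$last" ]; then
            republish "$1"
            last=$now
        fi
        sleep "${2:-60}"
    done
}

# Example usage (not executed here):
#   watch_dir "$HOME/mounts/foo" 300
```

Reading every file on each poll is expensive for large collections; comparing sizes and modification times would be cheaper but less strict.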

Voker57 commented 4 years ago

Note: If you like the filestore feature and want to make it better, I'd recommend implementing a stand-alone program to serve directories over IPFS. This program would:

  1. Monitor for changes within some target directory.
  2. Add & remove files from an "ipfs" directory when they're added/removed/changed in the target directory.
  3. Publish the root CID using IPNS and/or DNSLink.

This would sidestep the biggest issue with the filestore: if files referenced by the filestore are deleted or modified, go-ipfs will consider the datastore corrupt because a block it thinks should exist is missing.

Sounds like a lot of interfacing effort with unclear benefits for me, why not make IPFS monitor changes itself, and remove missing blocks automatically?

Stebalien commented 4 years ago

It doesn't interact well with go-ipfs pins/deduplication. You could later add other files that share blocks with these filestore files. If you then remove the filestore files, you'd lose the new files as well (or at least lose pieces of them).

It's possible to work around this but it's non-trivial.

AzureCerulean commented 4 years ago

@djdv ...> I made a shell extension for Windows that makes a symlink of the target's parent folder and this is the primary method I use for now.

I tried to access this, is it still available?

Vort commented 3 years ago

Symlinks are nothing better than making one more copy of the files: both are ugly hacks. Security problems are solved somehow by DC++, torrents etc. So it should be possible for IPFS too.

But for now IPFS is, sadly, unusable for me :( (I wanted to add already seeded via torrent Libgen files (instruction), but looks like it is impossible without hacks right now)

Also, if I tell software to share a single folder, then I expect that it will do exactly this thing. Not any file on HDD, not different folders etc. If it is so hard for developers to implement, then it is a larger problem than symlinks alone.

jsarenik commented 3 years ago

Adding myself to the list of people who use --nocopy on latest ipfs version just because according to https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#road-to-being-a-real-feature-2 this feature needs more people to use and report how well it works. It works well for me. Linux x86_64 glibc (Ubuntu 20.04). Thanks!

kovan commented 3 years ago

Does this even work when IPFS_PATH is set? I keep getting Error: cannot add filestore references outside ipfs root (.), no matter how I place the directories.
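The `(.)` in that error message suggests the ipfs root was derived from a relative repo path. One thing that may be worth ruling out (an assumption, not something confirmed in this thread) is a relative IPFS_PATH. `abs_dir` below is a hypothetical helper that turns it into an absolute path before the daemon and ipfs add are run.

```shell
#!/bin/sh
# abs_dir DIR
# Prints the absolute form of an existing directory path.
# Fails (without output) if DIR does not exist.
abs_dir() {
    (cd "$1" 2>/dev/null && pwd -L)
}

# Example usage (not executed here; ./data is an illustrative repo location):
#   IPFS_PATH=$(abs_dir ./data) && export IPFS_PATH
#   ipfs add --nocopy ./b.mp4
```

With the repo at an absolute location, files to be added with --nocopy would still need to sit below the directory that contains the repo.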

whoizit commented 3 years ago

I use bind to get around the problem outside ipfs root in /etc/fstab:

# example, if .ipfs dir in /home/user
/mnt/share /home/user/share none bind