steffenfritz / FileTrove

FileTrove indexes files and creates metadata from them.
https://filetrove.fritz.wtf
GNU Affero General Public License v3.0

[CHANGE] report on existence of xattrs #110

Open ross-spencer opened 3 weeks ago

ross-spencer commented 3 weeks ago

Is your feature request related to a problem? Please describe.

Report on xattrs associated with files.

Describe the solution you'd like

xattrs might exist for file objects on file systems that support them and they might contain useful metadata, see related projects I have been following (cc. @pjotrek-b):

^^ nb. your involvement would be awesome too Steffen.

They may also exist for steganographic purposes or for hiding information, see NTFS alternate data streams: https://blog.netwrix.com/2022/12/16/alternate_data_stream/.

In the first instance, a flag indicating that xattrs or streams exist would be great. A further output from a tool (maybe FileTrove) that lists them for archival objects would also be interesting.
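
As a rough sketch of what "a flag first, a listing later" could look like in a report record: the `xattrReport` type, its JSON field names, and `newXattrReport` are all hypothetical and are not FileTrove's actual schema. The names would come from something like `xattr.List`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// xattrReport is a hypothetical per-file record. It is NOT FileTrove's schema.
type xattrReport struct {
	Path      string   `json:"path"`
	HasXattrs bool     `json:"has_xattrs"`      // the "flag" from the request
	Names     []string `json:"names,omitempty"` // the optional listing
}

// newXattrReport derives the flag from the list of attribute names.
func newXattrReport(path string, names []string) xattrReport {
	return xattrReport{Path: path, HasXattrs: len(names) > 0, Names: names}
}

func main() {
	r := newXattrReport("mercs_sample.file", []string{"user.mercs.name"})
	out, err := json.Marshal(r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```

Keeping the flag separate from the listing means a first version could populate only `HasXattrs` and add the names later without changing the record shape.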

Describe alternatives you've considered

No alternatives, but there is a catch: Unix-like systems seem to behave similarly to each other, whereas NTFS alternate data streams are more arbitrary in nature, e.g. they don't (I believe) have namespaces.

Example code using: https://pkg.go.dev/github.com/pkg/xattr#section-readme

package main

import (
    "log"
    "strings"

    "github.com/pkg/xattr"
)

func main() {
    log.Println("checking xattrs")

    const path = "/tmp/mercs/mercs_sample.file"

    list, err := xattr.List(path)
    if err != nil {
        log.Fatal(err)
    }

    if len(list) == 0 {
        return
    }

    log.Println("flag: there are xattrs here")

    for _, v := range list {
        // Match on "namespace." so that a key such as "username" is not
        // mistaken for the "user" namespace.
        switch {
        case strings.HasPrefix(v, "user."):
            log.Println("user space key:", v)
        case strings.HasPrefix(v, "trusted."):
            // trusted namespace
        case strings.HasPrefix(v, "security."):
            // security namespace
        case strings.HasPrefix(v, "system."):
            // system namespace
        }
    }
}

And a script to output a sample file with xattrs:

#! /usr/bin/bash

set -e

FILENAME=mercs_sample.file

function create() {
 touch "$FILENAME"
 echo "mercs" > "$FILENAME"
 setfattr -n user.mercs.name -v mercs_sample.file "$FILENAME"
 setfattr -n user.mercs.date -v "$(date +'%Y-%m-%dT%H:%M:%S%z')" "$FILENAME"
 setfattr -n user.mercs.checksum -v "$(md5sum "$FILENAME" | cut -d ' ' -f 1)" "$FILENAME"
 setfattr -n user.mercs.mimetype -v "$(file -b --mime-type "$FILENAME")" "$FILENAME"
 setfattr -n user.mercs.encoding -v "$(file -b --mime-encoding "$FILENAME")" "$FILENAME"
 setfattr -n user.mercs.size -v "$(du -h "$FILENAME" | cut -f 1)" "$FILENAME"
}

function get() {
 echo "==============="
 echo "mercs dumpattrs"
 echo "==============="
 echo
 getfattr -d "$FILENAME"
}

create
get

Output of the Go example:

2024/10/25 09:16:05 checking xattrs
2024/10/25 09:16:05 flag: there are xattrs here
2024/10/25 09:16:05 user space key: user.mercs.name
2024/10/25 09:16:05 user space key: user.mercs.size
2024/10/25 09:16:05 user space key: user.mercs.date
2024/10/25 09:16:05 user space key: user.mercs.checksum
2024/10/25 09:16:05 user space key: user.mercs.encoding
2024/10/25 09:16:05 user space key: user.mercs.mimetype
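
The prefix checks in the Go example above could also be generalised by splitting each key into namespace and remainder. This is a minimal sketch that assumes Linux-style names of the form `namespace.key`, as in the output above. The helper names `splitNamespace` and `groupByNamespace` are made up for illustration, and the input list is hard-coded rather than read from a file.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// splitNamespace splits a Linux-style xattr name such as "user.mercs.name"
// into its namespace ("user") and the remainder ("mercs.name"). Names
// without a dot are returned with an empty namespace.
func splitNamespace(name string) (namespace, key string) {
	ns, rest, found := strings.Cut(name, ".")
	if !found {
		return "", name
	}
	return ns, rest
}

// groupByNamespace buckets xattr names by their namespace.
func groupByNamespace(names []string) map[string][]string {
	groups := make(map[string][]string)
	for _, n := range names {
		ns, key := splitNamespace(n)
		groups[ns] = append(groups[ns], key)
	}
	return groups
}

func main() {
	// Hard-coded sample input; in practice this would come from xattr.List.
	names := []string{"user.mercs.name", "user.mercs.size", "security.selinux"}
	groups := groupByNamespace(names)

	// Sort the namespaces so the output order is stable.
	namespaces := make([]string, 0, len(groups))
	for ns := range groups {
		namespaces = append(namespaces, ns)
	}
	sort.Strings(namespaces)

	for _, ns := range namespaces {
		fmt.Printf("%s: %v\n", ns, groups[ns])
	}
}
```

Names without a namespace end up under the empty key, which might also be a reasonable bucket for stream names that have no namespace concept.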

steffenfritz commented 3 weeks ago

Hi Ross, this would be an incredible feature and I will put it on the roadmap!

Thanks for the input and code! There will be a branch.

steffenfritz commented 3 days ago

I added some initial code to the branch. As expected, and as you indicated, there are some problems when checking for ADS. When you mount an NTFS volume from Linux, how streams are handled depends on the driver/module. ":" is handled as part of the filename, so a Linux driver might just drop the stream. Or it does not show the stream, and then you have to "guess" the stream name (which makes no sense). The same goes for handling xattrs on a file system mounted on Windows. So there are some hurdles :)
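
To make the ":" problem concrete: NTFS addresses a named stream as `file:stream`, optionally followed by a type suffix such as `:$DATA`. The sketch below only parses a base name of that shape; it does not touch the file system, and the function name `splitStream` is hypothetical. Since a colon is a legal character in Linux file names, a match is at best a hint that a stream may be involved, not proof.

```go
package main

import (
	"fmt"
	"strings"
)

// splitStream splits a base name of the form "file:stream" or
// "file:stream:$DATA" into its file and stream parts. ok is false when the
// name contains no colon, or when the file or stream part is empty.
func splitStream(base string) (file, stream string, ok bool) {
	name, rest, found := strings.Cut(base, ":")
	if !found || name == "" {
		return base, "", false
	}
	// Drop an optional stream type suffix such as ":$DATA".
	streamName, _, _ := strings.Cut(rest, ":")
	if streamName == "" {
		return base, "", false
	}
	return name, streamName, true
}

func main() {
	for _, name := range []string{"report.txt", "report.txt:hidden", "report.txt:hidden:$DATA"} {
		file, stream, ok := splitStream(name)
		fmt.Printf("%q -> file=%q stream=%q ok=%v\n", name, file, stream, ok)
	}
}
```

This only helps when the driver actually exposes the stream in the name. If the driver drops or hides the stream, there is nothing to parse, which is the hurdle described above.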