ipfs / ipfs-webui

A frontend for an IPFS Kubo and IPFS Desktop
https://webui.ipfs.io
MIT License
1.56k stars 490 forks source link

feat: handle non-local files with grace #2135

Open SgtPooki opened 1 year ago

SgtPooki commented 1 year ago

This task aims to resolve all of the issues in the Related items section below. The tasklist clarifies how we plan to accomplish this.

### Tasks
- [ ] feat(files): users know whether files are local or not before they click them
- [ ] feat(files): when offline, files/folders are not clickable if the user would not be able to view the content/children
- [ ] feat(files): users can replicate files locally with an action
- [ ] https://github.com/ipfs/ipfs-webui/issues/1248
- [ ] https://github.com/ipfs/ipfs-webui/issues/1691

Some details about a method forward for this at https://github.com/ipfs/ipfs-desktop/issues/2492#issuecomment-1620846856

Related items

acul71 commented 2 months ago

@SgtPooki @kt-wawro Planning to work on this after closing https://github.com/ipfs/ipfs-webui/pull/2255

acul71 commented 1 month ago

@SgtPooki @lidel
About:

To determine if a file is local, I'm using the following commands:

ipfs block stat --offline CID   # ipfs.block.stat(cid, { offline: true })

I’m retrieving all blocks for the current directory and checking if they are local or not by using:

ipfs dag get CID   # ipfs.dag.get(f.cid)

I modified the dirStats function in actions.js. Initially, I retrieved all the links for the listed files and checked if they were local. However, this turned out to be time-consuming. So, empirically, I now check only the first link and assume:

/**
 * @param {IPFSService} ipfs
 * @param {CID} cid
 * @param {Object} options
 * @param {string} options.path
 * @param {boolean} [options.isRoot]
 * @param {import('./utils').Sorting} options.sorting
 */
const dirStats = async (ipfs, cid, { path, isRoot, sorting }) => {
  const entries = await all(ipfs.ls(cid)) || []
  // Workarounds regression in IPFS HTTP Client that causes
  // ls on empty dir to return list with that dir only.
  // @see https://github.com/ipfs/js-ipfs/issues/3566
  const res = (entries.length === 1 && entries[0].cid.toString() === cid.toString())
    ? []
    : entries
  const files = []

  // precaution: there was a historical performance issue when too many dirs were present
  let dirCount = 0

  for (const f of res) {
    const absPath = join(path, f.name)
    let file = null

    if (dirCount < 1000 && (f.type === 'directory' || f.type === 'dir')) {
      dirCount += 1
      file = fileFromStats({ ...await stat(ipfs, f.cid), path: absPath })
    } else {
      file = fileFromStats({ ...f, path: absPath })
    }

    // Check if file is local
    const refs = await ipfs.dag.get(f.cid)
    // Extracting all CIDs
    const refsCids = refs.value.Links.map(link => link.Hash['/'])
    // console.log(refsCids)
    // console.log(`refs=${JSON.stringify(refs, null, 2)}`)
    if (dirCount > 300) {
      file.isLocal = undefined
    } else
    if (refsCids.length === 0) {
      file.isLocal = true
    } else {
      file.isLocal = true
      // for (const myCid of refsCids) {
      const myCid = refsCids[0]
      const cid = CID.decode(myCid)
      console.log(cid)
      try {
        // console.log(`myCid=${myCid} cid=${cid.toString()}`)
        // const res = await ipfs.block.stat(cid, { withLocal: true, offline: true })
        const res = await ipfs.block.stat(cid, { offline: true })
        console.log(`myCid=${cid}, res=${JSON.stringify(res, null, 2)}`)
        // break
      } catch (error) {
        file.isLocal = false
        console.error(`Error fetching stats for cid=${cid}:`, error)
        // break
      }
    }
    files.push(file)
  }

The UI looks like this:

image

PS: Try the Project Apollo Archives and XKCD archives Thoughts?

acul71 commented 1 month ago

@SgtPooki @lidel About

I wanted to check whether the browser is offline or online in JavaScript by using the navigator.onLine property, which returns a boolean value How ever not all browser are compliant like safari on mac. So I after browsing for info on forums, I thought to add an additional check, let's say once every 60 secs getting google icon to see if we're really on line.

const [isOnline, setIsOnline] = useState(navigator.onLine)

  const CHECK_INTERNET_STATUS = 60 * 1000

  function isGoogleOnline () {
    return new Promise((resolve) => {
      const img = document.createElement('img')
      img.onerror = () => resolve(false)
      img.onload = () => resolve(true)
      img.src = 'https://www.google.com/favicon.ico?_=' + new Date().getTime()
    })
  }

  // Function to create a delay
  const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms))

  // Function to check status and log the result
  async function checkInternetStatus () {
    let onlineStatus = navigator.onLine
    if (onlineStatus) {
      console.log('You seems online! Checking Google...')
      // Wait for 3 seconds
      await delay(3000)
      const googleOnline = await isGoogleOnline()
      console.log('Google is', googleOnline ? 'online' : 'offline')
      onlineStatus = googleOnline
    } else {
      console.log('No internet connection.')
    }
    setIsOnline(onlineStatus)
  }

  // Use useEffect to run the check once on component mount and set up an interval
  useEffect(() => {
    // Event listeners for online and offline status
    const handleOnline = () => checkInternetStatus()
    const handleOffline = () => setIsOnline(false)

    window.addEventListener('online', handleOnline)
    window.addEventListener('offline', handleOffline)

    // Check status immediately on page load
    checkInternetStatus()

    // Set up the interval to check status every 15 seconds
    const intervalId = setInterval(checkInternetStatus, CHECK_INTERNET_STATUS)

    // Clean up the interval and event listeners when the component unmounts
    return () => {
      clearInterval(intervalId)
      window.removeEventListener('online', handleOnline)
      window.removeEventListener('offline', handleOffline)
    }
  }, []) // Empty dependency array to run only once on mount

If we're offline I'll apply this class to shade/unclick the file view

.greyed-out {
  color: grey;
  pointer-events: none; /* This will prevent clicking */
  opacity: 0.5;
}

Should I choose only to use navigator.onLine property or use also the get image workaround, but I'm afraid in some cases it could fail even if we're not offline (could it happen a captcha ?) and shade the file content

Thoughts?

SgtPooki commented 1 month ago

hey @acul71, thanks for the ping. We shouldn't include a dependency on a service like google.. I think a simple check for a 'hello world' CID should be sufficient for a backup online check? such as a fetch for bafkqae2xmvwgg33nmuqhi3zajfiemuzahiwss or similar.

If we query the local kubo node it could resolve immediately (due to being an inline CID) even if we're not actually online, so i'm thinking maybe a light fetch request to trustless-gateway.link?

image
const url = 'https://trustless-gateway.link/ipfs/bafkqae2xmvwgg33nmuqhi3zajfiemuzahiwss?format=raw';

async function headRequest(url) {
  try {
    const response = await fetch(url, {
      method: 'HEAD',
    });

    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }

    // Log status code
    console.log('Status Code:', response.status);
  } catch (error) {
    console.error('Error making HEAD request:', error);
  }
}

headRequest(url);
acul71 commented 1 month ago

@SgtPooki @lidel About

I'm planning to modify doFilesFetch that handles the procedure. After the command const cid = await ipfs.files.cp(srcPath, dst) I'm doing ipfs.refs(srcPath, { recursive: true }) to actually get all the blocks. Is this the way or ? I saw that for large content It can take a long time to complete and no visual status (except console messages for the moment) is given

 /**
   * Adds file under the `src` path to the given `root` path. On completion will
   * trigger `doFilesFetch` to update the state.
   * @param {string} root
   * @param {string} src
   * @param {string} name
   */
  doFilesAddPath: (root, src, name = '', downloadLocally = false) => perform(ACTIONS.ADD_BY_PATH, async (ipfs, { store }) => {
    ensureMFS(store)

    const path = realMfsPath(src)
    const cid = /** @type {string} */(path.split('/').pop())

    if (!name) {
      name = cid
    }

    const dst = realMfsPath(join(root, name))
    const srcPath = src.startsWith('/') ? src : `/ipfs/${cid}`

    try {
      console.log(`Adding ipfs.files.cp(${srcPath}, ${dst}) downloadLocally=${downloadLocally}`)
      const cid = await ipfs.files.cp(srcPath, dst)
      console.log(`Adding ipfs.files.cp(${srcPath}, ${dst}) = ${cid}`)

      // ipfs refs --recursive --help
      if (downloadLocally === true) {
        for await (const ref of ipfs.refs(srcPath, { recursive: true })) {
          if (ref.err) {
            console.error('refs: ', ref.err)
          } else {
            console.log('refs: ', ref)
            // output: "QmHash"
          }
        }
      }
      return cid
    } finally {
      await store.doFilesFetch()
    }
  }),
acul71 commented 1 month ago

@SgtPooki @lidel About:

  • feat(files): Users can now see whether files are local or remote before clicking them.

To determine if a file is local, I'm using the following commands:

ipfs block stat --offline CID   # ipfs.block.stat(cid, { offline: true })

I’m retrieving all blocks for the current directory and checking if they are local or not by using:

ipfs dag get CID   # ipfs.dag.get(f.cid)

I modified the dirStats function in actions.js. Initially, I retrieved all the links for the listed files and checked if they were local. However, this turned out to be time-consuming. So, empirically, I now check only the first link and assume:

  • If one link CID is found to be local, the file is marked as local.
  • If one link CID is offline, the file is marked as remote.
  • If the directory list exceeds 300 entries, I stop checking and mark it as undefined.

..........

lidel (from Slack) Monday at 4:58 PM i think heuristic and proposed UX sounds sensible, you may run into trouble with big HAMT-sharded directories but people dont open such big directories with webui anyway. something you may explore is checking directory in a streaming fashion, or maybe even improving kubo RPC. what you can do today in CLI (i dont remember if kubo-rpc-client in JS supports this in streaming fashion): ipfs ls --stream --size=false --resolve-type=false --offline /ipns/en.wikipedia-on-ipfs.org/wiki/ will list content of directory without fetching root blocks of children (and you will get Error if you dont have enough blocks to enumerate directory ipfs ls --stream --size=true --resolve-type=true --offline /ipns/en.wikipedia-on-ipfs.org/wiki/ will also check size and type of every child, triggering its fetch, so if any of children is not local, you will get error like Error: block was not found locally (offline): ipld: could not find bafybeihn2f7lhumh4grizksi2fl233cyszqadkn424ptjajfenykpsaiw4 What we could add to Kubo RPC, is --is-local=true|false which would return additional field for every item in directory informing you if the root is local or not. This way we could do ipfs ls --stream --size=false --resolve-type=false --is-local=true /ipns/en.wikipedia-on-ipfs.org/wiki/ (using new flag instead of --offline and fetching all roots ) and this way it would not error on first missing block, but enumerate directory without doing extra work.

React

4:59 anyway, maybe open a PR or issue and we can discuss there? feels like useful feature, but we also should not add extra 1-2 requests per file 5:00 in my mind, locality check is a niche feature, most of users will have data locally, no need for them to pay penalty of those additional checks , so better to add locality status to RPC, just like we get file size i guess?

##############################

@lidel @SgtPooki I've been trying to profile the added code I find out that this is the time added by the additional check:

time_X                  43 files    0.152 seconds
time_Apollo_albums.txt  105 files   0.422 seconds
time_XKCD_Archive.txt   ~1800 files  1.429 seconds

Maybe what I can do is submit this solution in the PR and open an issue or feat request in kubo, to let this info (local or not) come back with ipfs.files.stat. It seems to me reasonable to wait some fraction of a second in average to get the info. And then when the kubo feat will be implemented, change the code accordingly. Let me know, what to do..

/**
 * @param {IPFSService} ipfs
 * @param {CID} cid
 * @param {Object} options
 * @param {string} options.path
 * @param {boolean} [options.isRoot]
 * @param {import('./utils').Sorting} options.sorting
 */
const dirStats = async (ipfs, cid, { path, isRoot, sorting }) => {
  const entries = await all(ipfs.ls(cid)) || []
  // Workarounds regression in IPFS HTTP Client that causes
  // ls on empty dir to return list with that dir only.
  // @see https://github.com/ipfs/js-ipfs/issues/3566
  const res = (entries.length === 1 && entries[0].cid.toString() === cid.toString())
    ? []
    : entries
  const files = []

  // precaution: there was a historical performance issue when too many dirs were present
  let dirCount = 0
  // let fileCount = 0

  let totDuration = 0
  for (const f of res) {
    const absPath = join(path, f.name)
    let file = null

    if (dirCount < 1000 && (f.type === 'directory' || f.type === 'dir')) {
      dirCount += 1
      file = fileFromStats({ ...await stat(ipfs, f.cid), path: absPath })
    } else {
      file = fileFromStats({ ...f, path: absPath })
    }

    // Start timing
    const startTime = performance.now()
    // If dirCount > 300 stop checking and mark isLocal as undefined
    if (dirCount > 300) {
    // if (fileCount > 300) {
      file.isLocal = undefined
    } else {
      // fileCount += 1
      // Check if file is local
      // TODO: See if # ipfs ls --size=false --resolve-type=false CID is faster then ipfs.dag.get CID
      const refs = await ipfs.dag.get(f.cid)
      // Extracting all CIDs
      // const refsCids = refs.value.Links.map(link => link.Hash['/'])
      const firstRefCid = refs.value.Links.length > 0 ? refs.value.Links[0].Hash['/'] : ''

      // console.log(refsCids)
      // console.log(`refs=${JSON.stringify(refs, null, 2)}`)
      // if (refsCids.length === 0) {
      if (firstRefCid === '') {
        file.isLocal = true
      } else {
        file.isLocal = true
        // for (const myCid of refsCids) {
        // const myCid = refsCids[0]
        // const cid = CID.decode(myCid)
        const cid = CID.decode(firstRefCid)
        // console.log(cid) // debug lucas
        try {
          // console.log(`myCid=${myCid} cid=${cid.toString()}`)
          // const res = await ipfs.block.stat(cid, { offline: true })
          await ipfs.block.stat(cid, { offline: true })
          // console.log(`myCid=${cid}, res=${JSON.stringify(res, null, 2)}`)
          // break
        } catch (error) {
          file.isLocal = false
          console.error(`Error fetching stats for cid=${cid}:`, error)
          // break
        }
      }
    }
    // End timing and log the duration
    const endTime = performance.now()
    // const duration = (endTime - startTime) / 1000 // Convert to seconds
    const duration = (endTime - startTime)
    totDuration += duration

    files.push(file)
  }
  console.log(`Time taken to check if file is local: ${totDuration / 1000} secs`)

  let parent = null

  if (!isRoot) {
    const parentPath = dirname(path)
    const parentInfo = infoFromPath(parentPath, false)

    if (parentInfo && (parentInfo.isMfs || !parentInfo.isRoot)) {
      const realPath = parentInfo.realPath

      if (realPath && realPath.startsWith('/ipns')) {
        parentInfo.realPath = await last(ipfs.name.resolve(parentInfo.realPath))
      }

      parent = fileFromStats({
        ...await stat(ipfs, parentInfo.realPath),
        path: parentInfo.path,
        name: '..',
        isParent: true
      })
    }
  }

  return {
    path,
    fetched: Date.now(),
    type: 'directory',
    cid,
    upper: parent,
    content: sortFiles(files, sorting)
  }
}
acul71 commented 1 month ago

@lidel @SgtPooki About:

image

What is the exact meaning of files, pins, blocks, local repo ?

Right know It's like this: image

I assume that:

About the expandable thingy (▶) About this numbers: Right now there is a tool-tip when you do a mouse over the label: image Should I have to keep this approach (info on mouse over) and/or add the expandable thingy (▶) ?

SgtPooki commented 1 week ago

What is the exact meaning of files, pins, blocks, local repo ?

Name Old name Meaning
files ~all blocks~ files The size of all the files including local and remote
pins n/a The number of items pinned
blocks n/a The number of total blocks in the MFS store
local repo ~files~ all blocks The size of all blocks actually stored in your kubo blockstore

Should I have to keep this approach (info on mouse over) and/or add the expandable thingy (▶) ?

I believe we want to add the expandable section so we can provide more information. It may not look great when expanded though, so we might need to pivot. However, let's start there

lidel commented 1 week ago

What is the exact meaning of files, pins, blocks, local repo ?

@acul71 the issue https://github.com/ipfs/ipfs-webui/issues/1248 you linked is from 2019 and is how old UI looked like. It was simplified since then (removed noise like pins and blocks).

I wrote down my understanding of the current state of that header (and the sources of values) in https://github.com/ipfs/ipfs-webui/issues/1248#issuecomment-2477025399.