textury / ardb

Interact with Arweave as if it were any other database. With typed definitions and without having to write a GQL story to retrieve data!
MIT License

Feature: `.fields()` only returns specific fields #3

Closed by fabianriewe 3 years ago

fabianriewe commented 3 years ago

Hi! I was thinking about adding a .fields() option that returns only the fields passed as parameters. This could make queries more performant. Happy to discuss that!
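
To illustrate the idea, here is a minimal sketch of how a field list could translate into a smaller GraphQL selection. `buildQuery` is a hypothetical helper written for this example; it is not part of ardb, and it assumes every requested name is a scalar field on the transaction node.

```javascript
// Hypothetical helper, not part of ardb: builds a transactions query
// that requests only the given top-level scalar fields of each node.
function buildQuery(fields) {
  if (fields.length === 0) {
    throw new Error("at least one field is required");
  }
  // One field per line, indented to sit inside `node { ... }`.
  const selection = fields.join("\n        ");
  return `query {
  transactions(first: 100) {
    edges {
      node {
        ${selection}
      }
    }
  }
}`;
}

const slim = buildQuery(["id"]);
const wide = buildQuery(["id", "anchor", "signature", "recipient"]);

console.log(slim);
console.log(`slim: ${slim.length} chars, wide: ${wide.length} chars`);
```

A shorter selection means the gateway has less to resolve and less to send back, which is the effect the proposed option is meant to exploit.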

cedriking commented 3 years ago

Hey @fabianriewe, interesting! Does it really help with the result's performance? If that's the case, then we should definitely add this!

fabianriewe commented 3 years ago

Hi @cedriking, I created a test script and got the following results. There does seem to be a performance difference.

argql: 12.857s
argql-full: 33.295s
ardb: 38.984s
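
To put these numbers in perspective, the snippet below converts each total into an average per request and a ratio against the id-only run. It only restates the timings above; the variable names are made up for the example, and the count of 50 requests per run is taken from the loops in the test script.

```javascript
// Timings reported above, in seconds, for 50 sequential requests each.
const REQUESTS = 50;
const timings = { argql: 12.857, "argql-full": 33.295, ardb: 38.984 };

// Average latency per request and slowdown relative to the id-only run.
const summary = Object.entries(timings).map(([name, seconds]) => ({
  name,
  msPerRequest: (seconds * 1000) / REQUESTS,
  ratio: seconds / timings.argql,
}));

for (const row of summary) {
  console.log(
    `${row.name}: ${row.msPerRequest.toFixed(1)} ms/request, ` +
      `${row.ratio.toFixed(2)}x argql`
  );
}
```

In this single run, the full ar-gql query took roughly 2.6 times as long as the id-only query, and ardb roughly 3 times as long.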

Script I ran:

import {run} from "ar-gql"
import Arweave from "arweave";
import ArDB from "ardb";

const client = new Arweave({
  host: 'arweave.net',// Hostname or IP address for a Arweave host
  port: 443,          // Port
  protocol: 'https',  // Network protocol http or https
  timeout: 20000,     // Network request timeouts in milliseconds
  logging: false,     // Enable network request logging
});

const ardb = new ArDB(client);

const main = async () => {
  console.log("Starting....")

  // Run 1: ar-gql, requesting only the transaction id.
  // No cursor is passed to run(), so every iteration fetches the same first page.
  console.time('argql')
  for (let i = 0; i < 50; i++) {
    const result = await run(`
      query($cursor: String) {
        transactions(
          tags: [{ name: "Application", values: ["ArVerify"] }]
          first: 100
          after: $cursor
        ) {
          pageInfo {
            hasNextPage
          }
          edges {
            cursor
            node {
              id
            }
          }
        }
      }`)
  }
  console.timeEnd("argql")

  // Run 2: ar-gql, requesting every field of each transaction.
  console.time('argql-full')
  for (let i = 0; i < 50; i++) {
    const result = await run(`
      query($cursor: String) {
        transactions(
          tags: [{ name: "Application", values: ["ArVerify"] }]
          first: 100
          after: $cursor
        ) {
          pageInfo {
            hasNextPage
          }
          edges {
            cursor
            node {
              id
              anchor
              signature
              recipient
              owner {
                address
                key
              }
              fee {
                winston
                ar
              }
              quantity {
                winston
                ar
              }
              data {
                size
                type
              }
              tags {
                name
                value
              }
              block {
                id
                timestamp
                height
                previous
              }
              parent {
                id
              }
            }
          }
        }
      }`)
  }
  console.timeEnd("argql-full")

  // Run 3: the same search through ArDB, using the query it builds itself.
  console.time("ardb")
  for (let i = 0; i < 50; i++) {
    const result = await ardb.search('transactions').tag("Application", "ArVerify").limit(100).find()
  }
  console.timeEnd("ardb")
}

main();

fabianriewe commented 3 years ago

I have also modified the ArDB code to fetch the same fields as the first ar-gql query, and it was faster than the full ar-gql query. But it's fair to say that the results vary a lot.

argql: 14.102s
argql-full: 34.855s
ardb: 25.910s // -> sometimes as low as 16, sometimes around 30. Always lower than argql-full.

cedriking commented 3 years ago

Really interesting, @fabianriewe! Do you think fields is the right name for this method? 🤔 I was wondering whether returns() might make more sense, or something else.

"Fields" sounds a bit ambiguous: it could mean the fields being returned or the fields being sent. Do you have any other names in mind?

fabianriewe commented 3 years ago

I am very used to MongoEngine (a Python ODM for MongoDB). They call it .only() and .exclude(). That could be some nice naming. What do you think?

https://mongoengine-odm.readthedocs.io/guide/querying.html?#retrieving-a-subset-of-fields

Also, it would be really cool if you could set these as defaults when instantiating the class. Most of the time, I only use ID, Tags, Blocks, Owner, Receiver, etc. for all my queries.

cedriking commented 3 years ago

Thanks @fabianriewe for this info, only/exclude is the best approach. I'm also a big fan of MongoEngine and tried to name most things similarly to how they work over there.

I didn't add it as a default in the constructor, since I don't think that makes much sense given how everything is structured, but everything is covered by only and exclude: we can even include or exclude just a small subset of nested children (children of children).

Available from version 1.0.5
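
For readers curious how only/exclude semantics with nested children could work, below is a rough sketch that filters a field tree using dotted paths such as `owner.address`. It is not ardb's implementation: `ALL_FIELDS`, `only`, `exclude` and `render` are hypothetical names for this example, the dotted-path syntax is an assumption, and ardb's real method signatures may differ. The field tree simply mirrors the selection used in the "argql-full" query earlier in the thread.

```javascript
// Hypothetical field tree: `true` marks a scalar, an object marks children.
const ALL_FIELDS = {
  id: true, anchor: true, signature: true, recipient: true,
  owner: { address: true, key: true },
  fee: { winston: true, ar: true },
  quantity: { winston: true, ar: true },
  data: { size: true, type: true },
  tags: { name: true, value: true },
  block: { id: true, timestamp: true, height: true, previous: true },
  parent: { id: true },
};

const clone = (value) => JSON.parse(JSON.stringify(value));

// Keep only the listed dotted paths; naming a parent keeps all of its children.
function only(tree, paths) {
  const out = {};
  for (const path of paths) {
    const [head, ...rest] = path.split(".");
    if (!(head in tree)) throw new Error(`unknown field: ${path}`);
    if (rest.length === 0) {
      out[head] = clone(tree[head]);
      continue;
    }
    if (tree[head] === true) throw new Error(`unknown field: ${path}`);
    const sub = only(tree[head], [rest.join(".")]);
    const existing = out[head] && out[head] !== true ? out[head] : {};
    out[head] = { ...existing, ...sub };
  }
  return out;
}

// Start from the full tree and delete the listed dotted paths.
function exclude(tree, paths) {
  const out = clone(tree);
  for (const path of paths) {
    const parts = path.split(".");
    let node = out;
    for (const part of parts.slice(0, -1)) {
      node = node && node !== true ? node[part] : undefined;
    }
    if (node && node !== true) delete node[parts[parts.length - 1]];
  }
  return out;
}

// Render a tree as a GraphQL selection, skipping parents left without children.
function render(tree, indent = "") {
  return Object.entries(tree)
    .filter(([, value]) => value === true || Object.keys(value).length > 0)
    .map(([key, value]) =>
      value === true
        ? `${indent}${key}`
        : `${indent}${key} {\n${render(value, indent + "  ")}\n${indent}}`
    )
    .join("\n");
}

console.log(render(only(ALL_FIELDS, ["id", "owner.address"])));
console.log(render(exclude(ALL_FIELDS, ["owner.key", "tags"])));
```

The first call prints a selection containing just `id` and `owner { address }`; the second prints the full selection minus `owner.key` and the whole `tags` block. Working on a copy of the tree keeps the defaults intact between queries.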