microlinkhq / open

Microlink Query Language #5

Open Kikobeats opened 5 years ago

Kikobeats commented 5 years ago

Specification

Feature Name (to be determined)

Feature Headline

Turns any website into your API.

Features

Developer Experience

Batching support

via dataloader.

const data = await microlink(url) // returns single object
const data = await microlink([url]) // returns a collection
const data = await microlink([url, url, url]) // returns a collection

Caching Support

via got#cache.

const data = await microlink([url, url, url]) // the first call fetches content from the server
const data = await microlink([url, url, url]) // successive calls get content from the local cache

Selector Declaration

// Simple request with API Parameters
const { status, data, message } = await microlink('https://example.com', { palette: true })

// declaring rules (see https://microlink.io/blog/custom-rules)
const client = microlink.extend({
  rules: {
    title: {
      selector: 'h1',
      attr: 'text', // `text` is default
      type: 'title',
      default: 'Hello World'
    }
  }
})

// multiple rules at once
const client = microlink.extend({
  rules: {
    title: {
      selectors: [{ selector: 'h1', attr: 'text' }, { selector: 'h1 > .title', attr: 'text' }],
      type: 'title',
      default: 'Hello World'
    }
  }
})

// using custom type
const truncatedTitle = value => value.substring(0,10)

const client = microlink.extend({
  rules: {
    title: {
      selector: 'h1',
      type: truncatedTitle
    }
  }
})

// custom rules are applied to any request made with `client`
const { status, data, message } = await client('https://example.com')
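As a rough sketch of how the rule shapes above could be resolved, the snippet below tries each selector in order and falls back to `default` when nothing matches. Everything here is hypothetical: `page` is a plain object standing in for a parsed document, and `query`, `resolveRule`, and `extract` are illustrative names, not part of the Microlink API.

```javascript
'use strict'

// Hypothetical stand-in for a parsed page: selector -> attribute values.
const page = {
  h1: { text: 'Microlink Query Language' }
}

// Look up one selector; `attr` defaults to `text`, as in the examples above.
const query = (page, { selector, attr = 'text' }) => {
  const node = page[selector]
  return node ? node[attr] : undefined
}

// Resolve one rule: accept either `selector` or a list of `selectors`,
// return the first match, otherwise the rule's `default`.
const resolveRule = (page, rule) => {
  const candidates = rule.selectors || [{ selector: rule.selector, attr: rule.attr }]
  for (const candidate of candidates) {
    const value = query(page, candidate)
    if (value !== undefined) return value
  }
  return rule.default
}

// Resolve every rule by name into a flat data object.
const extract = (page, rules) =>
  Object.fromEntries(
    Object.entries(rules).map(([name, rule]) => [name, resolveRule(page, rule)])
  )
```

For example, a `title` rule with `selectors: [{ selector: 'h1 > .title' }, { selector: 'h1' }]` would skip the first selector (absent from the stub page) and take the value from `h1`.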

Data Types

Metascraper types

Native JS types

Others

Also, add the ability to easily add new types (on client side):

{
  selector: 'h1',
  type: value => value.trim()
}
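One way this could work is to accept `type` either as the name of a built-in type or as a function. The sketch below assumes a small table of built-in types; the names in `builtinTypes` and the `applyType` helper are hypothetical and only illustrate the dispatch, not the actual list of supported types.

```javascript
'use strict'

// Hypothetical built-in types; the real names and behavior may differ.
const builtinTypes = {
  string: value => String(value),
  number: value => Number(value),
  boolean: value => Boolean(value)
}

// Apply a rule's `type` to an extracted value.
const applyType = (value, type) => {
  // No type declared: return the value untouched.
  if (type === undefined) return value
  // A function is treated as a custom, client-side type.
  const cast = typeof type === 'function' ? type : builtinTypes[type]
  if (!cast) throw new TypeError(`Unknown type: ${type}`)
  return cast(value)
}
```

With this shape, `applyType(value, value => value.trim())` and `applyType(value, 'number')` go through the same code path, so adding a new type on the client side needs no registration step.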

Consider whether it would be possible to load external dependencies, similar to how Deno imports from URLs.

Launch Day

Landing Page

Write a special section on the website to show the functionality.

Alongside or inside that section, write some short documentation on how to use it.

Recipes (https://microlink.io/recipes)

Write a series of recipes showing how to connect the functionality with a set of popular services to extract specific content (followers, following, stars, etc.).

3rd Party Apps

Bubble

Review how we can leverage the functionality with third party tools, like Bubble.

Inspiration

charsleysa commented 5 years ago

Would this replace the existing custom rules implementation or would it extend the implementation with a client side capability as well?

Kikobeats commented 5 years ago

It will use Custom Rules under the hood.

charsleysa commented 5 years ago

I'm assuming if we want multiple rules for the same rule name we just specify an array. (e.g. title: [ $('h1'), $('h2') ])

How do we specify the attr and type for a rule?

Kikobeats commented 5 years ago

I moved the future html parameter out of scope for now, to stay focused on what we can build today.

I updated the example with some features. Any suggestions?

charsleysa commented 5 years ago

For specifying multiple rules (and extra settings), would we do the following?

// custom rules (see https://microlink.io/blog/custom-rules)
const client = microlink.extend({
  rules: {
    title: [
        {
          selector: 'h1 > .title',
          attr: '<what do we put here for DOM Element content?>',
          type: 'title'
        },
        {
          selector: 'h1'
        }
    ]
  }
})

Kikobeats commented 5 years ago

Add documentation section

Kikobeats commented 5 years ago

Another example of client code, used for merging more than one request:


'use strict'

const { set, reduce, map } = require('lodash')
const { URL } = require('url')
const qsm = require('qsm')
const got = require('got')

const getMeta = async (url, { apiKey, ...opts } = {}) => {
  const { origin: originUrl } = new URL(url)
  const endpoint = apiKey
    ? 'https://pro.microlink.io'
    : 'https://api.microlink.io'

  const gotOpts = { json: true }
  if (apiKey) set(gotOpts, 'headers.x-api-key', apiKey)

  const res = await Promise.all([
    got(qsm.add(endpoint, { url: originUrl, ...opts }), gotOpts),
    got(qsm.add(endpoint, { url, ...opts }), gotOpts)
  ])

  const data = map(res, 'body.data')
  const meta = reduce(data, (acc, data) => ({ ...acc, ...data }), {})
  return meta
}

getMeta(process.argv[2])
  .then(meta => {
    console.log('meta', meta)
    process.exit()
  })
  .catch(err => {
    console.error(err)
    process.exit(1)
  })

Kikobeats commented 5 years ago

@charsleysa About how to specify multiple rules, what do you think about this proposal:

const client = microlink.extend({
  rules: {
    title: {
      selectors: [{ selector: 'h1', attr: 'text' }, { selector: 'h1 > .title', attr: 'text' }],
      type: 'title',
      default: 'Hello World'
    }
  }
})

charsleysa commented 5 years ago

@kikobeats I think that would work great!

It's similar to how we have structured a table in our DB storing custom rules (though in our system we transparently change the name if it clashes with an existing property as existing properties don't play nice with defaults).

Kikobeats commented 4 years ago

Updated to reflect the types already implemented.

Kikobeats commented 4 years ago

var isIP = require('net').isIP;

Interesting! builtin types 😄