serapath / task-messenger

task messenger
MIT License
0 stars 1 forks source link

Create - chat_input component #5

Open alyhxn opened 6 months ago

alyhxn commented 6 months ago

Todo

alyhxn commented 6 months ago

tasks - 2024.04.28

worklog

worklog-206 worklog-207 *At last, I meant the code is hard to understand by someone new I don't know if I explained well.

serapath commented 6 months ago

feedback 2024.04.28 (1/3)

alright. looks good.

few things i noticed in the communication, that should change.

So the idea is the following:

  1. when a invite/join is established
    • (= a handshake between bob and ana for example)
    • it is always for a specific task for now.
  2. every task has a "chat room", which has chat logs (e.g. think arrays)
  3. one array for each user in the chat log for each user
    • if bob invites ana to task 273, then:
      • bob's messenger creates:
      • TASKS[273].room = { '': [m0], [ana_ID]: [] }
      • ana's messenger creates:
      • TASKS[273].room = { '': [], [bob_ID: [m0] }
  4. m0 is message0 when bob creates task 273 and contains infos
    • e.g. m0.data contains the arguments passed to creating task 273 initially
  5. every message appended to a chat log is json:
    • e.g. { head, refs, type, data, meta }
    • for now we can just use what we need, e.g. head, type, data
  6. every time a message is appended to an array, all users are informed
    • Basically, when bob or ana initially meet or "shake hands"
    • because there is an invite and join,
    • there needs to be a channel established, e.g.
      user.id // could for now be: `${Math.random()}`.slice(2)
      user.mid = 0 // initial value
      user.name = 'bob' // or 'ana'
      user.send = msg => {}
  7. once e.g. bob receives a new message from ana, bob will append it to:
    • the array that represents ana's messages for the room, e.g.:
      ana.on = msg => {
      TASKS[273].room[ana_ID].push(msg)
      }
  8. this is a deterministically eventually growing data structure.
  9. the "chat room view" in the DOM is just an index of the individual chat logs

RULES:

  1. every message has a date, e.g. message.meta.data = Date.now()
    • you can make it more human readable via new Date(message.meta.data).toString()
  2. every message can point to messages of others
    • message.refs[name] = head

The name here is custom for us to choose, but message.refs basically let's us say: => Some messages are previous message which came BEFORE the current message.

This way, we will have for each user (bob, ana, ...), a data structure, where for each task room there is an associated room, which has one chat log per user so bob has: task123.room = { '': [], [ana_ID]: [], [eve_ID]: [], [ali_ID]: [], ... } and ana has: task123.room = { '': [], [bob_ID]: [], [eve_ID]: [], [ali_ID]: [], ... } and eve has: task123.room = { '': [], [ana_ID]: [], [bob_ID]: [], [ali_ID]: [], ... } and ali has: task123.room = { '': [], [ana_ID]: [], [bob_ID]: [], [eve_ID]: [], ... } The '' array is the log writable by the author user The id named arrays is the log read only by each author The logs will always have the same order for everyone, because they are append only Additionally, some messages in some logs will have message.refs to point to previous messages in the same log or messages of other logs FOR THE SAME ROOM.

The visible chat log view in the DOM needs to use that data structure to ideally create a deterministic view for everyone.

Now of course, how a specific kind of message is "visualized" in the chat view, can be custom, the way you make it more of a sentence, etc..., that's ok.

It might be good to be able to e.g. "highlight" other messages in the log, if they were contained by a message via "message.refs" heads. Think of normal messenger when you "reply" to a message. Or maybe it just highlights previous messages when you click a specific message Or maybe you can press a button to "collapse" all message which are not in the "message.refs" chain of a particular message (directly or indirectly).

Of course, a single message can contain more than one past message through message.refs[name] and hence those messages can also have further message.refs, so this is a tree structure of earlier message and it constitutes a topological ordering of message. For message which do not have a relation through message.refs, they are logically "parallel messages" and it depends on the order they are received. But we still can utilize message.meta.date to decide on the order of those messages.

If the order indicated by message.meta.date conflicts with what is indicates by the topological order of message.refs, then message.refs ALWAYS has precedence and a conflict should be indicated by a color in the "chat view" of that message. Technically a conflict like that should never happen in our internal demo :-) We don't plan to write code to create "fake/malicious" timestamps for now :D

serapath commented 6 months ago

feedback 2024.04.29 (2/3)

A general convention for how to use html and sub components would be to create either a "list container" to append a list of sub component instances or a "named element container" to append a singleton instance of a sub component.

It would be great if we could bit by bit really strictly follow that convention. Components will be re-used in different contexts and one big reason why we write components the way we do is that they have some sort of membrane around themselves to protect others and themselves from accidentally or maliciously interfering with each other in non-consent ways :-)


example

An example of a list container would be the .history div, which will contain a growing list of elements.

An example of a singleton element container would be some element , maybe .input for the input field sub component.

Currently we just append the input sub component instance to the .chat element, which of course, puts it in the right position, but there are downsides:

  1. looking at the HTML snippet doesnt really reveal there will be an input element and in what position it will be
  2. the current (container) element cant protect itself from malicious sub component instance code

https://github.com/serapath/task-messenger/blob/5c5617436f1307623252561daae124d7ffaa30b6/src/index.js#L46-L79

What does (2.) mean? The sub component can return something like:

function inputfield (...) {
  // ...
  const el = document.createElement('div')
  const sh = el.attachShadow({ mode: 'closed' })
  setTimeout(foo, 1000) // @TODO: will explain below
  // ...
  return el
}

The container component can

const inputfield = require('input-field')
function chat (...) {
  const el = document.createElement('div')
  el.innerHTML = `<div class="chat">
    <header> bob </header>
    <div class="history"></div>
  </div>`
  el.append(inputfield())
  return el
}

now what if:

function foo () { // code used in the first sub component
  const [header, history] = el.parent.children
  header.innerHTML = 'evil-hacker'
  history.innerHTML = 'you have been hacked!'
}

How can this be prevented? There are many parts to making things actually secure, but in order to even apply those, we have to write components that enable the basics. The change would need to happen in the container component and could look like this:

const inputfield = require('input-field')
function chat (...) {
  const el = document.createElement('div')
  el.innerHTML = `<div class="chat">
    <header> bob </header>
    <div class="history"></div>
    <div class="input"></div>
  </div>`
  const shopts = { mode: 'closed' }
  const input = el.querySelector('input').attachShadow(shopts)
  input.append(inputfield())
  return el
}

This simple change basically will not allow the sub component instance to get access to header or history, because it only has access to the content inside the empty shadow of the designated .input element shadow it gets appended to.


instantiating sub components

i see the following code in your code (simplified):

async function task_messenger (opts, protocol) {
  // ...
  { // task explorer
    // ...
    const element = task_explorer(opts = { users, host: username }, protocol)
    // ...
  }
  { // chat input
    // ...
    const element = await chat_input(opts = { users, host: username }, protocol)
    // ...
  }
}

Yes, the above technically works. You receive opts in task_messenger(...) you overwrite opts in task_explorer(...) you overwrite it again in chat_input(...) I think people don't expect this and maybe this was just an accident, but preferrably you would change that everywhere and in general in all code to:

async function task_messenger (opts, protocol) {
  // ...
  { // task explorer
    // ...
    const opts = { users, host: username }
    const element = task_explorer(opts, protocol)
    // ...
  }
  { // chat input
    // ...
    const opts = { users, host: username }
    const element = await chat_input(opts, protocol)
    // ...
  }
}

This basically does the exact same thing, but now javascript creates a nested opts variable inside the BLOCK SCOPE of task explorer instantiation and also chat input instantiation, where the locally defined opts variable shadows the outer scope opts reference, which is defined in function task_messenger (opts, ...) {. That way it is way more clear and ...if by any change the task messenger would use opts later on, it is not accidentally overwritten by the params to chat_input like in your current code. So changing it the way i shared above makes it a lot more accident safe :D https://github.com/serapath/task-messenger/blob/5c5617436f1307623252561daae124d7ffaa30b6/src/index.js#L78-L79

serapath commented 6 months ago

feedback 2024.04.29 (3/3)

refactoring / code review stuff

For example the following code we currently have in task-messenger/src/index.js is https://github.com/serapath/task-messenger/blob/5c5617436f1307623252561daae124d7ffaa30b6/src/index.js#L94-L123

And if we had the above, we definitely needed to change it, for example like the following:

  async function post_msg (message) {
    render_msg(messsage, 'right') // action "side effect"
    const channel = state.net[state.aka.task_explorer]
    channel.send({
      head: [id, channel.send.id, channel.mid++],
      type: 'post_msg',
      data: { content, username, chat_id: chat.id }
    })
    channel_up.send({
      head: [id, channel_up.send.id, channel_up.mid++],
      type: 'send',
      data: {from: username, users, to: 'task_messenger', type: 'on_post_msg', data_re: {content: content, chat_id: chat.id}}
    })
  }
  // ---
  function render_msg ({ data }, ...classes) {
    const { content, username } = data
    const element = document.createElement('div')
    element.classList.add('msg', ...classes)
    if (username === 'system') {
      element.classList.add('system')
      element.innerHTML = content
    } else {
      element.innerHTML = `
      <div class="username">${username}</div>${content}
      `
    }
    history.append(element)
    history.scrollTop = history.scrollHeight
  }

This already will allow us to also change the next function too

  async function on_post_msg (data) {
    const { from, data_re } = data
    const { content, chat_id } = data_re
    const channel = state.net[state.aka.task_explorer]
    const message = {
      head: [id, channel.send.id, channel.mid++],
      type: 'post_msg',
      data: { content, username: from, chat_id }
    }
    channel.send(message)
    if (chat && chat_id === chat.id) render_msg(msg)
  }

Maybe we can rename "on_post" and "on_post_msg" to:

open questions:

An open question is where should we STORE the history of messages we send and receive?

There is the "chat view", which updates based on what task is selected. If a task chat room has a long history, we might again want to lazy load and change based on scroll behavior what is even added to the DOM. If a task chat room is switched, why would we load everything into memory or keep non-visible long task rooms in memory?

Of course the "task messenger component" which updates the task view should probably store all messages based on chat rooms...

...but then again, that would be multiple lists per task, one for each peer in the chat (e.g. bob, ana, eve, ...)

An alternative would be to store the chat history lists (=arrays) inside each task component instance and when when a different task becomes active, then we retrieve the relevant messages for the "chat view" from there?

... Maybe the best option would be to have a scoped database. A database, where the task messenger creates the root instance, e.g. const db = taskdb() And then passes a "sub instance" to the main tasks of the task messenger, e.g. const subdb = db.sub(...)

That way, a specific task would only have access to store/read/do anything with it's own messages related to it's own task AND it could create further "views" or sub db intsances for further nested sub tasks.

When a different task gets focused by the user: The task messenger component, which is responsible for updating the "chat view", might receive an focus event from the new focused task or sub task and could then load that specific sub tasks database instance based on that sub tasks ID, e.g. const tdb = db.get(task_id) Now - based on user scrolling, etc... the task messenger could load/lookup tasks to populate the right elements of messages into the "chat view" :-)

Please share your recommendations as well. Any of the options above or further options should be considered and compared.

In the end, when we try to implement it and do the scrolling and lazy loading and updating of the "chat view", we will probably notice what data structures lend themselves best and how for example a taskdb API should look like to support what we need smoothly :-)

Basically the following would be touched, even after refactoring:

  async function open_chat ({ data }){
    chat = { data: data.chat, id: data.id }
    history.innerHTML = ''
    chat.data.forEach(msg => { // @INFO: here!
      const classes = []
      if (msg.username === username) classes.push('right')
      render_msg(messsage, ...classes) 
    })
    const channel = state.net[state.aka.chat_input]
    channel.send({
      head: [id, channel.send.id, channel.mid++],
      type: 'activate_input',
      data: 'Type a message'
    })
    update_status(data)
  }
  // and separate function
  function update_status (data) {
    const title = footer.querySelector('.title')
    title.innerHTML = data.name
    const input = footer.querySelector('.input')
    input.innerHTML = `Inputs: ${data.inputs ? data.inputs.length : '0'}`
    const output = footer.querySelector('.output')
    output.innerHTML = `Outputs: ${data.outputs ? data.outputs.length : '0'}`
    const task = footer.querySelector('.task')
    task.innerHTML = `Tasks: ${data.tasks ? data.tasks.length : '0'}`
  }

Above, where the comment says @INFO is the place where the task_messenger instance gets access to the relevant chat log messages.

In this context WE SHOULD differentiate between

  1. the source lists, e.g. { '': [], [ana_ID]:[], ... }
  2. the index cache, e.g. const chatroom = []

Now of course, this can be an internal part of taskdb or "sub task db's", but in order to compute chatroom array, we need to process the "source lists" and every time a new message is submitted by the task messenger user or by a peer, the source lists get updated and those updates need to be processed to update the "chatroom" index cache.

The source lists can always be reprocessed (replayed) from first to last to ideally deterministically re-generate the chatroom, but to do this every time we want to see the chat view of a task room, we would need to repeat the computation every time, so we rather want to think about the chatroom as a cache so we don't have to repeat calculations.

finaly: Now the chatroom is just an array or something inside the taskdb that could be just an array. All we need is to make sure we can efficiently lookup the messages we want to render based on user scrolling, lazy loading, etc... :-)


chat input component

all components, not just this one should utilize the state object to store any kind of state it uses, e.g. see https://github.com/serapath/task-messenger/blob/5c5617436f1307623252561daae124d7ffaa30b6/src/node_modules/chat_input/chat_input.js#L21-L24

which should probably be rewritten to the following:

async function chat_input (opts, protocol) {
  // ----------------------------------------
  // ID + JSON STATE
  // ----------------------------------------
  const id = `${ID}:${count++}` // assigns their own name
  const status = {}
  const state = STATE.ids[id] = { id, status, wait: {}, net: {}, aka: {} } // all state of component instance
  status.name = 'chat_input'
  status.shift_status = true
  status.textmode = "msg"
  status.username = opts.host

  // ----------------------------------------
  // PROTOCOL
  // ----------------------------------------
  // ...

  // ... 
}

A strong indicator would have even been, that you already called it shift_status

Now technically, we could also make up some more properties of the state object to group things, maybe state.db to associated more data or maybe state.db contains a key or symbol unique to a component instance, which can be used to load a taskdb instance for a specific symbol, so the symbol is the key to lookup the right sub data set for a specific component instance... we'll see.


The same technically applies to task_explorer.js

function task_explorer (opts, protocol) {
  // ----------------------------------------
  // ID + JSON STATE
  // ----------------------------------------
  const id = `${ID}:${count++}` // assigns their own name
  const status = {}
  const state = STATE.ids[id] = { id, status, wait: {}, net: {}, aka: {} } // all state of component instance
  const { host } = opts
  const name = 'task_explorer'
  let selected_task, query_result
  let chat_task, result, track
  const code_words = {inputs: 'io', outputs: 'io', tasks: 'task'}
  const add_words = {tasks: 'sup_tasks'}
  // ----------------------------------------
  // ...
}

to

// ----------------------------------------
const code_words = {inputs: 'io', outputs: 'io', tasks: 'task'}
const add_words = {tasks: 'sup_tasks'}
// ----------------------------------------
function task_explorer (opts, protocol) {
  // ----------------------------------------
  // ID + JSON STATE
  // ----------------------------------------
  const id = `${ID}:${count++}` // assigns their own name
  const status = {}
  const state = STATE.ids[id] = { id, status, wait: {}, net: {}, aka: {} } // all state of component instance
  status.host = opts.host
  status.name = 'task_explorer'
  status.selected_task = null
  status.query_result = null
  status.chat_task = null
  status.result = null
  status.track = null
  // ----------------------------------------
  // ...
}

So basically, what is actually part of a component instance, stays inside the component instance, e.g. on status., but general constants used by all instances does not need to be defined inside each instance, but should be part of the outer surrounding scope, basically the MODULE STATE, usually defined at the top of the file :-)


naming stuff

 const check = parent.children.length ? false : true

Saw this line and generally, a good convention we started using in those cases is, to use variable names which start either with is_ or has_, e.g. is_empty, is_array, is_expanded or has_element or has_user, etc...

It is really difficult to understand what check means.

So, rule of thumb would be to always choose names as specific as possible. As little abstract as possible.

Words that are more abstract as necessary are technically always correct, but practically will make you wonder what is happening and you need to read a lot of code to understand it instead of being able to guess it from the name alone.

Also, if we can maybe take more time to name things the way that most closely resemble what they are - i know naming is hard, it always takes some extra time to think about the name.

For example, return [element, tasks] seems confusing to me. Maybe it is appropriate, but somehow it is (i think as you mention in the worklog), an "empty div" to separate elements from each other, but if that is the case, couldn't we call it e.g. separator or section or delimiter or boundary or something, ideally shorter.

If i read "tasks", i imagine probably a list of elements or an array of task describing json data or a bullet point list of todos or something, so it needs additional cognitive load or effort to translate it to what it is.

alyhxn commented 6 months ago
  1. [x] Chat log data structure update
  2. [x] Enclose elements in shadow
  3. [x] Eliminate Duplication
  4. [x] Meaningful naming
  5. [x] Use state for variable storage