I think this is more food for thought for longer term versions, to keep in mind.
Would it make sense to have an abstraction that facilitates the breakdown of data structures such as Markdown, code, possibly at different levels and even hierarchy?
Examples
Markdown
# A dog story
Bob the dog was running in the grass...
## Dog colour
Bob hair was yellow...
# A cat story
Alice the cat was running in the grass...
## Cat colour
Alice hair was black...
['# A dog story\n\nBob the dog was running in the grass...\n\n## Dog colour\n\nBob hair was yellow...', '# A cat story\n\nAlice the cat was running in the grass...\n\n## Alice colour\n\nAlice hair was black...']
Code / machine language
def foo():
print("bar")
foo()
print("baz")
import os
if True:
print("qux")
def something():
print("something")
I think this is more food for thought for longer term versions, to keep in mind.
Would it make sense to have an abstraction that facilitates the breakdown of data structures such as Markdown, code, possibly at different levels and even hierarchy?
Examples
Markdown
Code / machine language
Hierarchies
Just an initial thought, think "["h1": "A dog story", children: [...],...]", e.g. a tree data structure could make sense
More thoughts
Why would we split at different levels?
PS: If I remember correctly, saw somewhere that white spaces should be removed with current openai model for better perf (TODO: add ref)