A short excursion away from wed to better understand piece-chains
More detailed sources at the end of this document.
Piece-chains are a structure for managing text.
Imagine loading a big piece of text into a buffer. If you want to make edits to that text, all the changes you make are directly on the buffer. This may mean splitting or copying parts of the buffer whenever you make edits. It also means that time to load or edit a file is function of the file size.
If we create an abstraction layer over the piece of text, all of our edits are on the abstraction layer.
Piece-chains are an abstraction layer that view a text as a chain of pieces. Each piece is a description of source of data, a starting point in that data, and a length. A piece chain is a chain of these pieces which together make up a text. Now editing the text is simply editing these descriptions: for example to delete text we simple decrement the value of the length. The time to edit or load a file is now a function of the number of pieces, regardless of whether the file is 1 line or 1000000000 lines.
A piece is an abstract data type of two parts: Piece and Piece_Descriptor.
A Piece is a pointer to a Piece_Descriptor:
typedef struct {
/* ... */
} Piece_Descriptor;
typedef Piece_Descriptor *Piece;
A Piece_Descriptor is a struct with a file descriptor, a position, a length, pointer to the next Piece_Descriptor, and a pointer to the previous Piece_Descriptor:
typedef struct piece_descriptor {
int file; /* file descriptor */
int pos;
int len;
piece_descriptor *next;
piece_descriptor prev;
} Piece_Descriptor;
The collection of Piece_Descriptors forms a doubly-linked list. This means that the Piece_Descriptors needn't be contiguous in memory.
A piece chain is a created by pointing the .next value of piece descriptor A at another piece descriptor B. Piece descriptor B should also have its .prev value set to a pointer to piece descriptor B and, if it is the final piece descriptor, it's .next value should be set to NULL.
Piece
load_file(char *name, Piece prev, Piece next)
{
Piece p = p_alloc();
p->file = open(name, O_RDONLY, 0);
p->len = file_size(p->file);
p->prev = prev;
if (prev != NULL) prev->next = p;
p->next = next;
if (next != NULL) next->prev = p;
return p;
}
/* ... */
Piece first_piece = load_file("as.txt", NULL, NULL);
Piece second_piece = load_file("bs.txt", first_piece, NULL);
Once we have a piece chain constructed, there a few simple things we want to do with it:
int text_pos_to_pc_pos(Piece p, int text_pos);
These simple things will require slightly more code than if we were working with a buffer but it will pay off in performance.
Location Text Piece Piece_Descriptor
t_* functions are for texts
/* returns Text struct with single piece */
Text t_load_file(char *name);
/* prints entire piece chain (text) */
void t_print(Text t);
/* frees entire piece chain (text) */
void t_free(Text t);
e_* functions are for text editing
/* deletes part of text */
void e_delete(Location loc);
p_* functions are for pieces
/* returns address of piece sized memory on heap */
Piece_p_alloc(void);
/* returns new piece */
Piece p_load_file(char *name, Piece prev, Piece next);
/* prints a piece to stdout */
void p_print(Piece p);
/* frees a single piece's allocated memory */
void p_free(Piece p);
/* Returns a Location with corresponding .piece and .offset to pos */
Location p_location_of(Text t, int pos);
/ takes a file descriptor and returns it size / int file_size(int file);