MartinPacker / filterCSV

Tools to manipulate CSV files in a format suitable for importing into various mindmapping programs - such as iThoughts, Freemind, and MindNode.
MIT License

filterCSV

iThoughts is a third-party application for creating and managing mind maps. It runs on iOS, iPad OS, Mac OS and Windows.

You can create a mind map either in the application itself or by importing files in a number of other formats. The most complete format is Comma-Separated Value (CSV). Being a text format, CSV can be programmatically created in a number of programming languages, such as Python.

The CSV format that iThoughts understands has a tree-like structure. A tree consists of nodes, which contain data as well as potentially child nodes. A node with no parent is called a root node. A node with no children is called a leaf node.

There can be multiple root nodes - and hence multiple trees - in an iThoughts CSV file, in which case it's better to call the ensemble a forest of trees.

As well as the nodes' tree structure, an iThoughts CSV file can store for each node its colour, its position, its shape and other attributes. To a very limited extent the format is documented here. A better way to understand the format is to export a mind map from iThoughts as CSV and look at the resulting file.

An Introduction To The iThoughts CSV File Format

Here is a sample CSV file in the format required by iThoughts:

"colour","shape","level","level0","level1","level2"
00FFFF,,0,"A"
,triangle,1,,A1
,square,1,,A2
FF0000,square,2,,,A2A

and here is how it looks when imported into iThoughts:

This is obviously a very simple example, but it illustrates some features of the file format:

A more detailed description of the file format is given in iThoughts CSV File Format but this brief description should be enough to get you started.

In more complex cases other columns come into play.
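
Because CSV is plain text, a file like the sample above can be generated programmatically. The following is a minimal sketch, not part of filterCSV: it assumes a hypothetical in-memory tree of (text, children) tuples and writes only the level and leveln columns, leaving out colour, shape and the other attributes.

```python
import csv
import io

# Hypothetical in-memory tree: each node is a (text, [children]) tuple.
tree = ("A", [("A1", []), ("A2", [("A2A", [])])])


def depth(node):
    """Number of levels in the subtree rooted at node."""
    _text, children = node
    return 1 + max((depth(child) for child in children), default=0)


def rows(node, level=0):
    """Yield one CSV row per node: the level, then the text in its leveln column."""
    text, children = node
    yield [level] + [""] * level + [text]
    for child in children:
        yield from rows(child, level + 1)


def to_csv(node):
    """Render the tree as CSV text with a level column and level0..leveln columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["level"] + [f"level{n}" for n in range(depth(node))])
    writer.writerows(rows(node))
    return buffer.getvalue()


csv_text = to_csv(tree)
print(csv_text)
```

Trailing empty cells are suppressed here, which the format tolerates according to the description later in this document.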

About filterCSV

filterCSV is a set of tools to automatically edit a CSV file in the form used in iThoughts. filterCSV is written in Python 3.6+. It has been tested on a Raspberry Pi and a machine running macOS.

Based on matching regular expressions, plus a few other criteria, you can do things for matching nodes such as:

You can check the structure of the input CSV file is good for importing into iThoughts.

You can export the CSV file as a Markdown file consisting of headings and bulleted lists, and in a number of other formats.

NOTE: In this document we will use terms such as "mind map" and "tree". Structurally the data represents a tree. It might or might not be used for mapping your mind.

Using filterCSV

filterCSV reads from stdin and writes to stdout, with messages (including error messages) written to stderr. For example:

filterCSV '^A1$' 'triangle' < input.csv > output.csv

It's designed for use in a pipeline, where the input of one program can be the output of another.

Do not specify the input and output files as command parameters. Instead, redirect stdin and stdout, as in the example above.

Command line parameters instruct filterCSV on how to process the parsed input file to create the output file. The parameters are specified in pairs. Each pair consists of:

  1. A specifier. This is a regular expression to match. (A special value all matches any value.)
  2. An action or sequence of actions.

In the case where no action is expected you can code anything you like for the second parameter. A useful suggestion would be to code . for it.

Instead of using command line parameters you can code the commands in a file read in from Stream 3. See Command Files for more information on this, potentially more flexible, way of controlling filterCSV.

You can get some basic help by invoking filterCSV with no parameters. That help points to this README and the project on GitHub.

Specifiers

Specifiers are used to specify which nodes to operate on and can be in one of the following forms.

Notes:


filterCSV < input_file.csv > output_file.csv \
    check repairsubtree \
    @level:1 'triangle note'

Actions

Actions you can take include:

In the following, action specifications are case-insensitive: if you specify, for example, an action in upper case it will be converted to lower case before being applied to matching nodes.

You can, in most cases, specify a sequence of actions. You can separate them by spaces or commas. If you specify multiple actions you probably need to surround them with a pair of single quotes.

Colour Numbers

A colour number is a 1- or 2-digit number. It is specified relative to the top left of iThoughts' colour palette. (1 is the first colour in the palette.)

You can also specify nextcolour, nextcolor or even nc and filterCSV will select the next colour in iThoughts' colour palette. samecolour, samecolor or sc can be used to specify the same colour again.

Colour RGB Values

This is a hexadecimal 6-character representation of the colour, in Red-Green-Blue (RGB) format. For example FFAAFF.

Automatic Colouring

Rather than either using Colour Numbers or Colour RGB Values you might be able to automate colouring nodes.

Automatic node colouring requires you to code a capturing group inside a regular expression.

Here is an example:

filterCSV '-(.?)-' autocolour < test1.csv > test2.csv

The capturing group is the portion of the regular expression inside the round brackets. When filterCSV processes this command it keeps track of the values that match the capturing group and uses them to consistently colour the nodes.

Notes

  1. You can specify autocolour, autocolor, or even ac.
  2. You can use multiple capturing groups, for example RC: (.*) SC: (.*). filterCSV uses all the groups to form a key; when any of the capturing groups' values changes a new colour is selected.
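
The idea of keying colours on capturing groups can be sketched as follows. This is an illustration only, not filterCSV's own code: the function name assign_colours, the palette_size value and the use of re.search are all assumptions.

```python
import re


def assign_colours(texts, pattern, palette_size=24):
    """Map each matching text to a colour number, reusing it for a repeated key.

    palette_size is a hypothetical palette length, not a documented value.
    """
    regex = re.compile(pattern)
    colour_for_key = {}
    result = {}
    for text in texts:
        match = regex.search(text)
        if match is None:
            continue  # non-matching nodes are left uncoloured
        key = match.groups()  # all capturing groups together form the key
        if key not in colour_for_key:
            # First time this key is seen: take the next colour number.
            colour_for_key[key] = len(colour_for_key) % palette_size + 1
        result[text] = colour_for_key[key]
    return result


colours = assign_colours(
    ["RC: 0 SC: 4", "RC: 0 SC: 8", "RC: 0 SC: 4 again", "other"],
    r"RC: (\d+) SC: (\d+)",
)
print(colours)
```

Nodes that share the same captured values end up with the same colour number, while a change in any group selects a new one.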

Delete

delete deletes the matching node and all its children.

Keep

keep retains the matching node, all of its children, and its parent, grandparent, great-grandparent, etc. The idea is to retain a workable tree.

For example

filterCSV 'EXCPs' keep < input.csv > output.csv

would retain any nodes which match the string "EXCPs", and all the nodes below them. In addition, to ensure the tree remained valid (for import into iThoughts) any nodes leading from the root (level 0) to the matching nodes would be retained.

Note: You can use regular expression alternation to keep multiple subtrees. For example:

filterCSV 'A1|X' keep < input.csv > output.csv

where | means either the term to the left (A1) or the term to the right (X) can be used to match.

If you use keep in a filterCSV action you can't use anything else. For example, you can't use triangle. You can use another specifier, perhaps all with triangle, to get the same effect.
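
The retention rule described above (matching nodes, everything below them, and the chain back to the root) can be sketched like this. It is a minimal illustration under the assumption that nodes arrive as (level, text) pairs in depth-first order; the function name keep is hypothetical and this is not filterCSV's implementation.

```python
import re


def keep(nodes, pattern):
    """Return matching nodes plus their descendants and ancestors.

    nodes is a list of (level, text) pairs in depth-first order.
    """
    regex = re.compile(pattern)
    kept = set()
    ancestors = []     # ancestors[n] is the index of the current node at level n
    keep_below = None  # level of the matching node whose subtree we are inside
    for index, (level, text) in enumerate(nodes):
        del ancestors[level:]  # drop chain entries at this level or deeper
        if keep_below is not None and level <= keep_below:
            keep_below = None  # we have left the matching subtree
        if keep_below is not None:
            kept.add(index)  # descendant of a matching node
        elif regex.search(text):
            kept.update(ancestors)  # the path from the root stays valid
            kept.add(index)
            keep_below = level
        ancestors.append(index)
    return [nodes[i] for i in sorted(kept)]


nodes = [(0, "A"), (1, "A1"), (1, "A2"), (2, "A2A"), (0, "X"), (1, "X1")]
print(keep(nodes, "A2$"))
print(keep(nodes, "A1|X"))
```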

Shapes

You can specify a shape for matching nodes using one of the names in the list in iThoughts Shape Names.

For example:

filterCSV '^CF' triangle < input.csv > output.csv

would change the shape of any nodes which match the string "CF" (but having no characters preceding "CF") to a triangle.

You can also specify nextshape, or ns and filterCSV will select the next shape in iThoughts' set of shapes. sameshape or ss can be used to specify the same shape again.

Positions

Positions are specified in the form {x,y} where the braces are necessary.

At present setting the position only seems to work for Level 0 (root) nodes. You can have as many Level 0 nodes as you like.

For example:

filterCSV 'A Root Node' '{100,200}' < input.csv > output.csv

would move a level 0 node whose name includes the string 'A Root Node' to position (100,200).

Icons

You can add an icon to matching nodes using one of the names in the list in iThoughts Icon Names.

For example:

filterCSV 'Done' tick < input.csv > output.csv

would add a tick icon to any nodes which match the string "Done".

Note: A node can have more than one icon, so specifying tick in the above example would not replace any other icon; it would add a tick icon to any existing ones.

Priority

You can set a node's priority with priority:n or prio:n. You can unset it with nopriority or noprio.

For example:

filterCSV 'Unimportant' prio:5 < input.csv > output.csv

will set the priority for any node matching "Unimportant" to 5.

Progress

You can set a node's progress with progress:n or prog:n. You can unset it with noprogress or noprog.

For example:

filterCSV 'Got nowhere' prog:0 < input.csv > output.csv

will set the progress for any node matching "Got nowhere" to 0%.

Removing Notes, Shapes, Colours, Positions, Icons, Progress, And Priority

If you specify noshape, nocolour, nonote, noposition, or noicons the corresponding attribute is removed from matching nodes.

Similarly priority or progress can be unset with nopriority, noprio, noprogress or noprog.

Most usefully you could specify this with a match condition of all to reset an entire column. For example, nonote could clear all the notes from a mind map - to prepare it for exporting from iThoughts.

Eliminating A Level

Suppose you have some nodes at level 1 and you want to make them all level 0, retaining their subtrees. With promote you can do this.

If you specify, for example

filterCSV promote 2 < input.csv > output.csv

the nodes at level 1 are deleted and all their direct children move up to level 1. These children might be the root of subtrees. All nodes in the subtrees are also promoted by 1 level.

Computing Statistics About A Mind Map

If you specify stats it will write basic statistics to an output file in one of the following forms:

The statistics are (for each level):

Here is an example - produced by specifying stats text:

Level Nodes Distinct Nodes
    0     2              1
    1     2              2
    2     1              1
    3     1              1
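
Counts of this kind can be reproduced with a few lines of Python. The sketch below is illustrative only: it assumes nodes are (level, text) pairs and that "Distinct Nodes" means distinct node texts at a level, which is an interpretation of the column heading rather than something stated here.

```python
from collections import defaultdict


def level_stats(nodes):
    """Return (level, node count, distinct text count) for each level."""
    texts_by_level = defaultdict(list)
    for level, text in nodes:
        texts_by_level[level].append(text)
    return [
        (level, len(texts), len(set(texts)))
        for level, texts in sorted(texts_by_level.items())
    ]


# Hypothetical data chosen to give the same numbers as the table above.
nodes = [(0, "A"), (1, "A1"), (1, "A2"), (2, "A2A"), (3, "A2A1"), (0, "A")]
stats = level_stats(nodes)
for level, count, distinct in stats:
    print(f"{level:>5} {count:>5} {distinct:>14}")
```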

Merging Nodes Into Their Parent Node

You can merge a matching node into its parent as a bullet.

To do this specify asbullet. For example

filterCSV '^A1$' asbullet < input.csv > output.csv

will merge any node whose text or note is 'A1' into its parent. The text of the node will be merged in, with the two characters '* ' denoting that it is a bulleted item.

Sorting Child Nodes

You can sort the child nodes of selected nodes. The sort will be alphabetical and ascending.

For example

filterCSV '^A1$' sort < input.csv > output.csv

will sort all the children of the nodes whose text is "A1".

Reversing The Order Of Child Nodes

You can reverse the order of the child nodes of selected nodes.

For example

filterCSV '^A1$' reverse < input.csv > output.csv

will reverse the order of the children of the nodes whose text is "A1".

You can use reverse after sort to make the sort effectively alphabetically descending.

Spreading Out Level 0 (Root) Nodes

If you import a CSV file into iThoughts without specifying positions in the file iThoughts will place all the Level 0 (root) nodes on top of each other. This is probably not what you want. filterCSV can spread out the Level 0 nodes - either horizontally or vertically.

For example, if you specify vspread 500 filterCSV will set the positions of the Level 0 nodes 500 units apart - one above the other.

Similarly, if you specify hspread 1000 filterCSV will set the positions of the Level 0 nodes 1000 units apart - spaced out horizontally.

In both cases the children will be arranged as normal, relative to the root nodes.

vspread and hspread set the values for these nodes in the "position" column in the CSV file. Their format is of the form "{1000,0}". (In this example "1000" is the horizontal offset and "0" is the vertical offset.) If you specify vspread or hspread they will overwrite all Level 0 nodes' positions.
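
As an illustration of the position values involved, here is a small sketch that produces evenly spaced "{x,y}" strings. The function name spread_positions is hypothetical, and placing the first root at the origin with increasing offsets is an assumption rather than documented behaviour.

```python
def spread_positions(root_count, spacing, vertical=False):
    """Return one "{x,y}" position string per root node, spacing units apart."""
    positions = []
    for n in range(root_count):
        # Horizontal spreading varies x; vertical spreading varies y.
        x, y = (0, n * spacing) if vertical else (n * spacing, 0)
        positions.append(f"{{{x},{y}}}")
    return positions


horizontal = spread_positions(3, 1000)
vertical = spread_positions(3, 500, vertical=True)
print(horizontal)
print(vertical)
```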

Replacing Strings

Regular expressions can be used for searching for and replacing strings - if these strings are in a format the Python re module's sub method understands.

For example, the author's production code emits iThoughts-friendly CSV files where the newline character ("\n") is generated as a semicolon. filterCSV can readily replace every semicolon by a newline character. For example:

filterCSV ';' sub:$'\n' < input.csv > output.csv

Here the shell renders $'\n' as a newline character. iThoughts honours these newline characters in rendering the nodes.

To indicate you want a matched string replaced, code the action beginning with sub:. For example, if you want every occurrence of "A" replaced with "B" code:

filterCSV 'A' 'sub:B' < input.csv > output.csv

You can use references to matching groups. For example:

filterCSV '(\d)' 'sub:\1\1' < input.csv > output.csv

replaces every numeric digit with two copies of itself. Here the capturing group, marked by the bracketed expression (\d), is referred to as "Capturing Group 1". The \1 in the replacement refers to this capturing group.

In general you can use the full flexibility of Python 3's re.sub() method.
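
To try out a pattern and replacement before passing them to filterCSV, you can call re.sub() directly. This short sketch mirrors the two examples above on made-up strings; the variable names are illustrative.

```python
import re

# Every digit is replaced by two copies of itself via group reference \1.
doubled = re.sub(r"(\d)", r"\1\1", "A2A 10")

# Every semicolon is replaced by a newline character.
with_newlines = re.sub(";", "\n", "first;second")

print(doubled)
print(with_newlines)
```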

Match Statistics

When filterCSV checks nodes against each specifier you get statistics for how many nodes matched the criterion. You also get the number of nodes that matched no criteria in the run.

Here is a sample:

Match Statistics
----------------
Match count for RegEx 'A2': 3
Match count for RegEx 'A1': 2
Match count for RegEx '^A1': 1
Match count for @level:1: 2
Match count for @noshape: 0
Match count for @shape:square: 1
Match count for RegEx '^A$': 1
Remaining unmatched: 0
----------------

Input Files

Input files can be in one of six formats:

Nesting Level Detection

When parsing a Markdown or tab/space-indented non-CSV file the first line with leading white space (spaces and/or tabs) is used to detect the indentation scheme: Any white space on this line is used as a single indentation level marker. It is expected that all lines are indented in the same way.

For example, if the second line starts with two spaces this is taken to indicate that every line will have zero, two, four, etc. spaces.

If the file is detected to contain Markdown, leading dashes and asterisks followed by a single space are removed. (These are bulleted list markers in Markdown.) For example:

* A
    * A1

leads to the input file being interpreted as a level 0 node with text "A" and its child node with text "A1".
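
The detection scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not filterCSV's parser: the function name detect_levels is hypothetical, headings and metadata are ignored, and every line is assumed to use whole multiples of the detected indentation unit.

```python
def detect_levels(lines):
    """Return (level, text) pairs for indented lines, with bullet markers removed."""
    unit = None
    for line in lines:
        stripped = line.lstrip(" \t")
        if stripped and len(stripped) < len(line):
            # The first indented line defines one level of indentation.
            unit = line[: len(line) - len(stripped)]
            break
    result = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        stripped = line.lstrip(" \t")
        indent_length = len(line) - len(stripped)
        level = 0 if unit is None else indent_length // len(unit)
        if stripped.startswith(("* ", "- ")):
            stripped = stripped[2:]  # drop the Markdown bullet marker
        result.append((level, stripped))
    return result


markdown_levels = detect_levels(["* A", "    * A1"])
plain_levels = detect_levels(["A", "  B", "    C", "  D"])
print(markdown_levels)
print(plain_levels)
```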

You can also specify nesting levels using one of the two forms of heading that Markdown supports. Consider the following example:

# Fruit

## Citrus

* Lemon
* Orange
* Vaguely orange-like
    * Mandarin
    * Satsuma

In this example:

The resulting CSV file contains:

"dueDate","startDate","effort(hours)","priority","progress","icons","colour","note","position","shape","level","level0","level1","level2","level3"
"","","","","","","","","","","0","Fruit"
"","","","","","","","","","","1","","Citrus"
"","","","","","","","","","","2","","","Lemon"
"","","","","","","","","","","2","","","Orange"
"","","","","","","","","","","2","","","Vaguely orange-like"
"","","","","","","","","","","3","","","","Mandarin"
"","","","","","","","","","","3","","","","Satsuma"

Imported into iThoughtsX, the resulting tree looks like:

For an XML input file the nesting level is in the data stream; elements' children are at a deeper nesting level. On import child nodes are created for the element's value (if it has one) and any attributes. These are colour coded and free-standing nodes marked "Element", "Value", and "Attribute" are added as a legend.

Metadata

When importing Markdown text it can contain metadata that isn't part of the structure of the data. Here is an example of such a file:

font-size: 14
creation-date: 2020-08-29

   * A
       * A1
       * A2

Metadata consists of key/value pairs, one per line, at the beginning of the file. After the last line of metadata is a blank line. filterCSV performs two actions on encountering metadata:

Checking

If the input data is not a CSV file filterCSV performs two checks on it:

These checks are intended to help debug an indentation problem.

If you specify check - in place of a regular specifier - filterCSV will check the level column of the data. It will detect bad levels in the following way: If the level of a node is greater than one more than the parent node it will consider this to be an error - which will be reported.

If a level error is detected one of three things can happen:

As an example, you might code

filterCSV check repair < input.csv > output.csv
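
The level rule described above can be illustrated with a short sketch. It assumes the rows are in depth-first order, so that comparing each level with the previous row's level is enough to spot a jump of more than one; the function name find_level_errors is hypothetical and this is not filterCSV's own check.

```python
def find_level_errors(levels):
    """Return the indices of rows whose level jumps more than one deeper."""
    errors = []
    previous = -1  # so that the first row is expected to be at level 0
    for index, level in enumerate(levels):
        if level > previous + 1:
            errors.append(index)
        previous = level
    return errors


bad_rows = find_level_errors([0, 1, 3, 1, 2])
good_rows = find_level_errors([0, 1, 2, 1, 0, 1])
print(bad_rows, good_rows)
```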

Handling CSV Files Not In The Format iThoughts Expects

You can import a CSV file and the tree structure is described by how many empty cells are to the left of the first cell with text in.

If there is more than one cell in the line with text in the last such cell is used to form a note for the node.

Here is an example.

"A"
,"A1"
,"A2","This is a note"
,,"A2A"

In the above node "A" is at level 0, nodes "A1" and "A2" are at level 1 - and "A2" has a note ("This is a note"), and "A2A" is at level 2.
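
A minimal sketch of this interpretation, using Python's csv module, follows. The function name parse_indented_csv is hypothetical, and the code is an illustration of the rule rather than filterCSV's implementation.

```python
import csv
import io


def parse_indented_csv(text):
    """Return (level, text, note) triples from CSV where leading empty cells set the level."""
    nodes = []
    for row in csv.reader(io.StringIO(text)):
        filled = [i for i, cell in enumerate(row) if cell != ""]
        if not filled:
            continue  # skip rows with no text at all
        first, last = filled[0], filled[-1]
        # With more than one filled cell, the last one becomes the note.
        note = row[last] if last != first else ""
        nodes.append((first, row[first], note))
    return nodes


sample = '"A"\n,"A1"\n,"A2","This is a note"\n,,"A2A"\n'
parsed = parse_indented_csv(sample)
print(parsed)
```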

Output Formats

While generally you would write a CSV file for importing into iThoughts, you can also export the data to:

Markdown Output

A CSV file can be exported to Markdown with each level rendered according to the following rules:

You can specify Markdown output by invocations such as

filterCSV markdown 3 < myfile.csv > myfile.md

or

filterCSV markdown '2 3' < myfile.csv > myfile.md

In the first case three levels of heading are required, starting with heading level 1. (Heading level 1 is the default).

In the second case two levels of heading are required, starting with heading level 3.

HTML Output

You can export to HTML as either a nested list or a table. Colour is preserved on output, as are notes.

You can specify HTML nested list output by invocations such as

filterCSV html list < myfile.csv > myfile.html

You can specify HTML table output by invocations such as

filterCSV html table < myfile.csv > myfile.html

The table will have extra columns if any of the nodes in the tree have any of the following attributes. Each column has its own class attribute - which can be used for styling with CSS:

Attribute   Class
Due Date    dueDate
Start Date  startDate
Effort      effort
Priority    priority
Progress    progress

Freemind And OPML XML Output

filterCSV can output to Freemind XML format, including notes and colours:

filterCSV xml freemind < myfile.csv > myfile.mm

filterCSV can output to OPML XML format, but support for notes and colours by other programs is mixed. For example OmniFocus will create a custom "Notes" column:

filterCSV xml opml < myfile.csv > myfile.opml

GraphViz .dot Format

filterCSV can export in a format compatible with the GraphViz .dot language. It creates a digraph (directed graph). Here is a sample output file:

digraph {
 rankdir=TB
  N1[label="A"]
  N2[label="A1"]
  N1 -> N2
  N3[label="A2"]
  N1 -> N3
  N4[label="A2A",fillcolor="#00ff00",shape="rectangle",style="rounded,filled"]
  N3 -> N4
  N5[label="A2A1",fillcolor="#00ff00",shape="rectangle",style="rounded,filled"]
  N4 -> N5
  N6[label="X",shape="square"]
}

Here a number of nodes, whose names begin with "N", are defined. Additionally, directed links (arcs with arrows on them) are defined between them. filterCSV preserves colours and most shapes on export to .dot format.

Here is a sample invocation:

filterCSV digraph vertical < test.csv > test.dot

You can use the dot command (part of GraphViz) to turn this into a PNG graphic:

dot -Tpng test.dot > test.png

In the above example the parameter vertical was used to align the root nodes next to each other, with descendants down the page. If you specify any other value, for example 'horizontal' or '.', the alignment will be horizontal. (You can use v for short, for vertical.)
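
The shape of the digraph output can be approximated with a short sketch. It is illustrative only: the function name to_digraph is hypothetical, nodes are assumed to be (level, text) pairs in depth-first order, and rankdir, colours, shapes and label escaping are all left out.

```python
def to_digraph(nodes):
    """Return GraphViz .dot text with one node per row and an edge from each parent."""
    lines = ["digraph {"]
    parents = []  # parents[n] is the name of the current node at level n
    for number, (level, text) in enumerate(nodes, start=1):
        name = f"N{number}"
        lines.append(f'  {name}[label="{text}"]')
        if level > 0:
            lines.append(f"  {parents[level - 1]} -> {name}")
        del parents[level:]  # forget nodes at this level or deeper
        parents.append(name)
    lines.append("}")
    return "\n".join(lines)


dot_text = to_digraph([(0, "A"), (1, "A1"), (1, "A2"), (2, "A2A"), (0, "X")])
print(dot_text)
```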

Indented Text

You can write out the data as a text file where indentation is used to denote levels in the tree hierarchy. Use the indented command to write out the text in this format. There are numerous options for how to indent the text:

Here are two examples of usage:

filterCSV < input.csv > output.txt indented space:4

will write out a file where four space characters are used for each level of indentation.

filterCSV < input.csv > output.txt indented "--"

will write out a file where two dashes are used for each level of indentation.
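
The effect of the two examples can be pictured with a tiny sketch. The function name to_indented is hypothetical and nodes are assumed to be (level, text) pairs; it shows the output shape, not filterCSV's code.

```python
def to_indented(nodes, indent="    "):
    """Return text with one line per node, prefixed by indent repeated level times."""
    return "\n".join(indent * level + text for level, text in nodes)


nodes = [(0, "A"), (1, "A1"), (1, "A2"), (2, "A2A")]
spaces_text = to_indented(nodes)        # four spaces per level
dashes_text = to_indented(nodes, "--")  # two dashes per level
print(spaces_text)
print(dashes_text)
```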

iThoughts CSV File Format

The CSV format that iThoughts understands has a tree-like structure, within a table. As well as the tree of nodes, colour, position, node shape and other attributes of a node can be specified in the CSV file. To a very limited extent the format is documented here. A better way to understand the format is to export a mind map from iThoughts as CSV and look at the resulting file.

The first row of the table contains headings iThoughts uses to understand the layout of the following rows. Each subsequent row represents a node.

In summary, the iThoughts CSV file format has, at a minimum, the following columns:

This would be for a mind map with only (isolated) top-level nodes. An example like this would be

level,level0
0,Text for this sole node

The iThoughts CSV format is tabular.

But usually you want more than one level of node:

level,level0,level1,level2
0,Top-level node
1,,Next-level node
2,,,Leaf node at level 2
1,,Another intermediate-level node

Here the structure is more apparent:

filterCSV ensures the "level" and "leveln" columns are present - to the extent needed by the tree. It also always adds the following columns, before the "level" column:

These extra columns are filled in to allow filterCSV to do interesting things with the attributes they represent, such as

While iThoughts can tolerate CSV files where trailing empty cells are suppressed, filterCSV includes them.

Command Files

Instead of specifying commands as pairs of parameters on the command line you can use Stream 3 to point to a file containing the commands.

For example:

filterCSV < input.csv > output.csv 3< commands.txt

Here the 3< commands.txt specifies the commands will be read in from the file commands.txt.

The format is very similar to the command line format for specifiers and actions. For example:

'^A1$' 'triangle FF0000' // A1 nodes get the red triangle treatment

In the above any characters before the first space are treated as the specifier. They do not have to be in quotation marks. Any characters after the first space are treated as the actions - up to just before the double slash.

Note: The specifier and the actions must be on the same line.

In the example above a comment was introduced by //. Any characters after this on the same line are treated as a comment and ignored. You can comment out a whole line with // - which might be useful for exploration purposes. Blank lines are also ignored. Comments aren't feasible with command line parameters - so using a command file like this might be preferred.
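
The line format described above can be sketched as a small parser. This is an illustration, not filterCSV's code: the function name parse_command_line is hypothetical, stripping surrounding quotation marks is an assumption, and a specifier that itself contains // would be cut short by this simple approach.

```python
def parse_command_line(line):
    """Return (specifier, actions) for one command-file line, or None if it holds no command."""
    # Everything from the first // onwards is a comment.
    content = line.split("//", 1)[0].strip()
    if not content:
        return None  # blank line or whole-line comment
    # The specifier runs up to the first space; the rest is the actions.
    specifier, _, actions = content.partition(" ")
    return specifier.strip("'\""), actions.strip().strip("'\"")


example = parse_command_line(
    "'^A1$' 'triangle FF0000' // A1 nodes get the red triangle treatment"
)
print(example)
```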

Test Files

tests/README.md describes test files that you can study to become familiar with filterCSV.

iThoughts Shape Names

The following shape names are defined by iThoughts.

auto
rectangle
square
rounded
pill
parallelogram
diamond
triangle
oval
circle
underline
none
square bracket
curved bracket

iThoughts Icon Names

The following icon names are defined by iThoughts. You can use them in two places

The names below are in the sequence they appear in the iThoughts icon palette.

tick
tickbox
p0
p1
p2
p3
p4
p5
p6
p7
p8
p9
signal-flag-red
signal-flag-yellow
signal-flag-green
icon-signal-flag-black
icon-signal-flag-blue
icon-signal-flag-orange
icon-signal-flag-purple
icon-signal-flag-white
icon-signal-flag-checkered
icon-hat-black
icon-hat-blue
icon-hat-green
icon-hat-red
icon-hat-white
icon-hat-yellow
icon-calendar1
icon-calendar7
icon-calendar12
icon-calendar31
icon-calendar52
arrow-down-blue
arrow-left-blue
arrow-right-blue
arrow-up-blue
arrow-up-green
arrow-down-red
stop
prep
go
smiley_happy
icon-smiley-neutral
smiley_sad
icon-money
currency-dollar
currency-euro
currency-pound
currency-yen
icon-currency-won
icon-currency-yuan
hand-yellow-card
hand-red-card
hand-stop
hand-thumb-down
hand-thumb-up
question
icon-questionmark
icon-information
icon-exclamationmark
alert
icon-add
cross
sign-forbidden
sign-stop
idea
icon-camera
auction-hammer
bell
bomb
dynamite
fire
hourglass
target
view
icon-airplane
icon-alarmclock
icon-bug
icon-businessmen
icon-car
icon-clients
icon-cup
icon-data
icon-desktop
icon-earth
icon-flash
icon-gear
icon-heart
icon-key
icon-lock-open
icon-lock
icon-mail
icon-pin
icon-printer
icon-scales
icon-star
icom-telephone
icon-pencil
icon-alarm
icon-book
icon-certificate
icon-cloud
icon-compasses
icon-dice
icon-folder
icon-document
icon-male
icon-female
icon-newspaper
icon-paperclip
icon-presentation
icon-signpost
icon-step