mhbeals / NewspaperDissector

The Newspaper Dissector is visualisation software for categorising and visualising newspaper pages based on topic, word count and source
MIT License

Newspaper Dissector v.0.3.5.2

The Newspaper Dissector is visualisation software for categorising and visualising newspaper pages based on category and word count. It is based upon code built at the Software Sustainability Institute's 2017 Collaboration Workshop with the support of Geraint Palmer and Vince Knight (Cardiff University).

An example of its use can be seen in the Anatomy of a Newspaper blog series and in the forthcoming article "Close readings of big data: Triangulating patterns of textual reappearance and attribution in the Caledonian Mercury, 1820-1840", Victorian Periodicals Review, 2018.

Requirements

This programme requires

Configuration of Input Data

Data File

Each row of the data file should include the following fields:

For example:

1820_06_15_Caledonian Mercury   1   1   1   Meeting r   "Sample Text"
1820_06_15_Caledonian Mercury   1   1   2   Loan/Borrow l   "Sample Text"
1820_06_15_Caledonian Mercury   1   1   3   Loan/Borrow l   "Sample Text"
1820_06_15_Caledonian Mercury   1   1   4   Loan/Borrow r   "Sample Text"

Or:

1820_06_15_Caledonian Mercury   1   1   1   Meeting r   1.5
1820_06_15_Caledonian Mercury   1   1   2   Loan/Borrow l   2.0
1820_06_15_Caledonian Mercury   1   1   3   Loan/Borrow l   3
1820_06_15_Caledonian Mercury   1   1   4   Loan/Borrow r   1.5

Example data.tsv and datanum.tsv files are included in the release directory, alongside the executable.
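
As a rough illustration of the two layouts above, the following Python sketch reads headerless tab-separated rows and derives one size value per row: a word count when the final column holds text, or the number itself when it is numeric. This is not the Dissector's own code. The function name `row_weights`, the `numeric` flag and the whitespace-based word count are assumptions made for illustration only.

```python
import csv
import io


def row_weights(tsv_text, numeric=False):
    """Return one size value per row of a headerless TSV string.

    Assumption: only the final column is inspected. It holds either
    text (counted in whitespace-separated words) or a number.
    """
    weights = []
    # csv.reader strips the surrounding double quotes from quoted fields.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if not row:
            continue  # skip blank lines
        last = row[-1]
        if numeric:
            weights.append(float(last))
        else:
            weights.append(len(last.split()))
    return weights
```

Called on a text-style row ending in `"Sample Text"`, this returns a count of two words for that row; called with `numeric=True` on a datanum-style file, it returns the final-column values as floats.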

Legend File

Each row of the legend file should include the following fields:

For example:

News    a
Commentary  b
Advertising c
Miscellany  d

An example legend.tsv file is included in the release directory, alongside the executable.
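
A legend file of this shape can be read into a simple lookup table. The sketch below assumes, from the example rows only, that the first column is the label and the second is the key; the function name `read_legend` is hypothetical and is not part of the Dissector.

```python
import csv
import io


def read_legend(tsv_text):
    """Map each key (second column) to its label (first column).

    Assumption: two tab-separated columns per row, no header.
    """
    legend = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        label, key = row[0].strip(), row[1].strip()
        legend[key] = label
    return legend
```

Keying the dictionary on the short code makes it easy to look up the display label for a code found in the main data file.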

Please report any errors or exceptions to the author, M. H. Beals.

Use

To use the Newspaper Dissector, place the executable file (NewspaperDissector.exe) in a directory/folder with one or more tab-separated (.tsv) data and legend files. The files should have no headers and include only the raw data.

Once run, the programme will ask for your working directory. You must include the full path to your data files (usually the folder in which NewspaperDissector.exe has been placed) including the final "\".

The programme will then provide you with four options (A to D), plus the option to exit the programme.

A) Configure data files

This option allows you to change which files the programme accesses and how it interprets them.

F) Main data file

This option allows you to change the file name of the page/snippet data, and to indicate whether the final column contains text (from which a word count will be calculated automatically) or a numeric value. By default, the programme expects a file named data.tsv which contains text in the final column.

G) Legend data file

This option allows you to change the file name of the legend data or to enter labels and keys manually. By default, the programme expects a file named legend.tsv.

B) Configure visualisation

This option allows you to set (or change) the colour scheme of your visualisation. You currently have three options.

Samples of these can be found here

If you do not configure this option, you will be prompted to do so the first time you create a visualisation. You must use this option if you wish to change the colour scheme during subsequent runs.

C) Configure the python command

This option allows you to replace python with another command, for example python3, in the final batch file.
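
If you are unsure which command your installation provides, a short check such as the one below may help before configuring this option. It is a minimal sketch that is separate from the Dissector itself; the function name `pick_python_command` and the candidate list are assumptions, and it only reports which name is found on your PATH.

```python
import shutil


def pick_python_command(candidates=("python", "python3")):
    """Return the first candidate command found on PATH, or None."""
    for name in candidates:
        # shutil.which returns the full path of the command, or None.
        if shutil.which(name):
            return name
    return None


if __name__ == "__main__":
    print(pick_python_command())
```

Whichever name it prints is the one to enter when the programme asks for the python command.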

D) Create a visualisation

This option creates two Python files, one containing the code to create the page visualisations and one to create the horizontal and vertical legend images. It will then attempt to run both and create the PNG images. The programme is currently set to use the command "python" rather than "python3". If your Python installation requires the latter (or another variant), you will need to change the command using option C above.

Changelog

v.0.3.5.2:

v.0.3.5.1:

v.0.3.5.0:

v.0.3.4.0:

v.0.3.3.0:

v.0.3.2.0:

v.0.2.2.0: