An automated tool for validating OSINT. This forms part of the final step of OSINT production as detailed in NATO's open source handbook (2001). This is a research artefact for my dissertation at the University of Portsmouth.
See the results of the different Entity Recognition language models here. Note how the standard spaCy 'en_core_web_sm' NER model struggles to recognise military information compared to the model used for this project, which uses the Defence Science and Technology Laboratory 're3d' dataset.
Note: First, please attempt to use the Google Colab; more info below.
git clone https://github.com/UP2040499/auto-osint-v.git
conda --version
cd ~/<install directory>/auto-osint-v
conda env create -f environment.yml -n auto-osint-v-python38
mamba env create -f environment.yml
eval "$(conda shell.bash hook)" #copy conda command to shell
conda activate auto-osint-v-python38
python -m auto_osint_v
Open an 'Anaconda Powershell Prompt' from the Start Menu, then run the following:
conda init powershell
conda activate auto-osint-v-python38
python -m auto_osint_v
python -m auto_osint_v <ARGS>
The following descriptions can also be found by running auto_osint_v -h.
-s/--Silent
Assumes you have already entered the intelligence statement here.
-n/--NoEditor
Input intelligence statement into command line rather than into text editor.
--html
Output will be in HTML (default: csv).
-m/--markdown
Output will be in markdown (default: csv).
-f/--FileToUse
Specify the file to read the intelligence statement from.
-p/--output_postfix
Specify the output file's postfix, e.g. 'output3.txt' rather than default 'output.txt'.
python -m auto_osint_v
The following command reads the statement from the existing intelligence file and outputs the results to a markdown file called 'output0.md'.
python -m auto_osint_v -s -m -p 0
The postfix (0 in this case) is useful if you are running the tool multiple times and want to save the results separately.
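To make the interaction between the flags and the output file name concrete, here is a minimal sketch of how options like these could be declared with Python's standard argparse module. It is illustrative only and is not the project's actual source: the function names build_parser and output_filename, the dest names, the EXTENSIONS mapping, and the rule that markdown wins if both --html and -m are given are all assumptions.

```python
import argparse

# Hypothetical mapping from output format to file extension (an assumption,
# chosen so that "-m -p 0" yields 'output0.md' as described above).
EXTENSIONS = {"csv": "csv", "html": "html", "markdown": "md"}


def build_parser() -> argparse.ArgumentParser:
    """Declare flags matching the option list; help text paraphrases it."""
    parser = argparse.ArgumentParser(prog="auto_osint_v")
    parser.add_argument("-s", "--Silent", dest="silent", action="store_true",
                        help="assume the intelligence statement is already entered")
    parser.add_argument("-n", "--NoEditor", dest="no_editor", action="store_true",
                        help="enter the statement on the command line, not in an editor")
    parser.add_argument("--html", action="store_true",
                        help="write HTML output (default: csv)")
    parser.add_argument("-m", "--markdown", action="store_true",
                        help="write markdown output (default: csv)")
    parser.add_argument("-f", "--FileToUse", dest="file_to_use", default=None,
                        help="file to read the intelligence statement from")
    parser.add_argument("-p", "--output_postfix", dest="output_postfix", default="",
                        help="postfix appended to the output file name")
    return parser


def output_filename(args: argparse.Namespace) -> str:
    """Build the output file name from the chosen format and postfix."""
    # Assumption: markdown takes precedence if both format flags are set.
    if args.markdown:
        fmt = "markdown"
    elif args.html:
        fmt = "html"
    else:
        fmt = "csv"
    return f"output{args.output_postfix}.{EXTENSIONS[fmt]}"


if __name__ == "__main__":
    # Same arguments as the example command above.
    args = build_parser().parse_args(["-s", "-m", "-p", "0"])
    print(output_filename(args))  # output0.md
```

Under these assumptions, running the tool again with -p 1 would write to a separate file, so earlier results are not overwritten.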
Previously, I recommended using Google Colab to run this tool. However, the default machine in Google Colab performs worse than most local machines would (likely because of CPU limits). You can pay for a higher-performing machine with a GPU, which does improve performance.
The Google Colab can be found here
Google Colab is recommended because it runs the tool remotely. While performance on a local machine may be better, the tool used most of my (underpowered) machine's available resources (CPU, RAM).
If the tool struggles to run on your local machine, use Google Colab to avoid hogging your computer's resources.