Gemino is a file duplicator with advanced capabilities useful for copying forensics datasets and doing logical acquisitions.
This was born as a small project to help in my workflow, and has since evolved into a more complete tool with support for log files and forensic containers.
Using gemino you can optimize the copy of large datasets to multiple drives for backup or distribution purposes. By reading the source data only once, gemino does not suffer from source bottlenecks. A traditional copy by a script or via the OS GUI reads the source once for each destination; with a parallel copy to many disks this means parallel reads, which slow overall performance.
Reading the source data only once for the entire process (copy and hashing) allows gemino to drastically increase copy performance, especially on slow source devices such as network drives or HDDs.
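The single-read idea can be illustrated with a minimal sketch. This is not gemino's actual code: the function name `copy_once_with_hash`, the 1 MiB block size, and the choice of SHA-256 are assumptions made for the example. Each block is read from the source once, fed to the hash, and written to every destination.

```python
import hashlib
from contextlib import ExitStack

BLOCK_SIZE = 1024 * 1024  # illustrative value, not gemino's setting


def copy_once_with_hash(source, destinations, algorithm="sha256"):
    """Copy `source` to every path in `destinations` in a single read pass.

    Returns the hex digest of the source data, computed during the same pass.
    (Hypothetical helper written for this example.)
    """
    digest = hashlib.new(algorithm)
    with ExitStack() as stack:
        src = stack.enter_context(open(source, "rb"))
        outputs = [stack.enter_context(open(dst, "wb")) for dst in destinations]
        while True:
            block = src.read(BLOCK_SIZE)
            if not block:  # end of file
                break
            digest.update(block)      # hash the block...
            for out in outputs:       # ...and reuse it for every destination
                out.write(block)
    return digest.hexdigest()
```

With N destinations the source is still read once, rather than N times as with N independent copy commands.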
Basic Copy to Multiple Destinations.
Copy to AFF4 Container for Single Destination.
Open AFF4 Container to preview and export content.
To reduce the impact of point 3, the containers are created using only ZipSegments, independently of file size, to improve compatibility. If a tool does not support the AFF4-L standard, or fails to process the data, the container can still be imported as a ZIP archive.
Gemino has a basic viewer for containers featuring:
Do not mix destination devices with different I/O - write speeds! The overall speed will be that of the slowest device!
gemino uses a "multicast"-like approach: the data is read buffered from the source and sent to the destination disks in blocks of 64MB. Within a block, each write is independent (it runs in a dedicated thread); however, all devices must finish writing the current block before the copy moves on to the next one. As a result, fast devices (e.g. a USB SSD at ~300/400MB/s) have to wait until the buffer has been copied to a slow device (e.g. a USB key at ~50MB/s).
When copying, ensure the target devices are as close as possible in terms of performance, ideally the same model.
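The block-by-block behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not gemino's implementation: the function name `multicast_copy` is hypothetical, and a `ThreadPoolExecutor` stands in for the dedicated write threads. Only the 64MB block size and the "wait for every destination before the next block" rule come from the description.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import ExitStack

BLOCK_SIZE = 64 * 1024 * 1024  # 64MB blocks, as described above


def multicast_copy(source, destinations, block_size=BLOCK_SIZE):
    """Read `source` once and write each block to all destinations in parallel.

    Returns the number of blocks copied. (Hypothetical helper for illustration.)
    """
    blocks = 0
    with ExitStack() as stack:
        src = stack.enter_context(open(source, "rb"))
        outputs = [stack.enter_context(open(dst, "wb")) for dst in destinations]
        # One worker per destination, so every write of a block can run at once.
        pool = stack.enter_context(ThreadPoolExecutor(max_workers=max(1, len(outputs))))
        while True:
            block = src.read(block_size)
            if not block:  # end of file
                break
            futures = [pool.submit(out.write, block) for out in outputs]
            # Barrier: the next block is not read until every destination has
            # finished this one, so the slowest device sets the pace.
            for future in futures:
                future.result()  # also re-raises any write error
            blocks += 1
    return blocks
```

Using the figures above, one 64MB block takes roughly 0.2 s on the ~300MB/s SSD and roughly 1.3 s on the ~50MB/s key, so every destination advances at about 50MB/s.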
Hash verification is for the moment implemented serially. As hashing is a CPU-bound operation, multithreading in Python would not be useful (all the threads are bound to one and the same core; see CPython implementation details here for more information about that). Solving this would need implementing multiprocessing, which will be for another time. Feel free to submit a PR.
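As a rough illustration of what serial verification means, the sketch below re-reads each destination copy one after another and compares its digest with the digest of the source. The names `hash_file` and `verify_copies`, the chunk size, and the SHA-256 default are assumptions for the example and are not taken from gemino's code.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path`, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(expected_digest, copies, algorithm="sha256"):
    """Hash each copy in turn and report whether it matches `expected_digest`.

    Returns a dict mapping each path (as a string) to True or False.
    """
    results = {}
    for path in copies:  # serial: one copy is fully hashed before the next starts
        results[str(path)] = hash_file(path, algorithm) == expected_digest
    return results
```

Total verification time here is the sum of the per-copy hashing times, which is the cost a multiprocessing version would aim to reduce.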
Just download the binaries for your system
Python 3.9 Required
git clone https://github.com/fservida/gemino
cd gemino
pip install -r requirements.txt
python src/main/python/main.py
If you have issues running gemino, please look at .github/workflows/github-actions-package.yml and check how we build for your OS.
On platforms with Apple Silicon, use the following to create the needed environment for x64 binaries using Rosetta (first ensure Rosetta is installed by running any x64 binary):
CONDA_SUBDIR=osx-64 conda create -n rosetta python # create a new environment called rosetta with intel packages.
conda activate rosetta
python -c "import platform;print(platform.machine())"
conda config --env --set subdir osx-64
Supported Platforms
Other Platforms
This project uses PyInstaller for packaging. If binaries for your platform are not available (or compatible), you can download the source and package it yourself. Please share the packaging commands for others to use, we'll all be grateful (づ ̄ ³ ̄)づ .
When submitting PRs please ensure your code is well commented, just like mine. (just kidding, mine is a mess too)
Feel free to open issues and leave your feedback
By using "gemino" you are consenting to our policies regarding the collection, use and disclosure of personal information set out in this privacy policy.
We do not collect, store, use or share any information, personal or otherwise.
If you email the developer for support or other feedback, the emails with email addresses will be retained for quality assurance purposes. The email addresses will be used only to reply to the concerns or suggestions raised and will never be used for any marketing purpose.
We will not disclose your information to any third party except if you expressly consent or where required by law.
If you have any questions regarding this privacy policy, you can email gemino@francescoservida.ch