UTMediaCAT / mediacat-backend

mediacat-backend

Post Processor Usage

Before running, make sure to remove the testing files from the DomainOutput and TwitterOutput directories.

cd Post-Processor
python3 processor.py

Advanced Usage

The post-processor also supports multiprocessing for better performance. To use this feature, run python3 processor.py -num_procs=x -limit=y, where x is the number of processes to use and y is the memory limit (in bytes) for the local data, after which the data is written to disk. Increasing -limit will prevent memory errors but may reduce performance speed. Recommended usage: python3 processor.py -num_procs=10 -limit=5000000
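As a rough illustration of what a byte limit such as -limit implies, the sketch below buffers records in memory and writes them to disk once an estimated size reaches the limit. SpillBuffer, the chunk file names, and the JSON-based size estimate are all hypothetical; none of them are taken from processor.py, which may handle this differently.

```python
import json
import tempfile
from pathlib import Path


class SpillBuffer:
    """Hypothetical helper: keep records in memory until a byte limit is reached."""

    def __init__(self, limit_bytes, out_dir):
        self.limit_bytes = limit_bytes
        self.out_dir = Path(out_dir)
        self.records = []
        self.size = 0      # estimated bytes currently held in memory
        self.flushes = 0   # number of chunk files written so far

    def add(self, record):
        self.records.append(record)
        # Assumption: approximate the footprint by the JSON-encoded length.
        self.size += len(json.dumps(record).encode("utf-8"))
        if self.size >= self.limit_bytes:
            self.flush()

    def flush(self):
        """Write the buffered records to a new chunk file and clear the buffer."""
        if not self.records:
            return
        path = self.out_dir / f"chunk_{self.flushes}.json"
        path.write_text(json.dumps(self.records))
        self.records = []
        self.size = 0
        self.flushes += 1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        buf = SpillBuffer(limit_bytes=5_000_000, out_dir=tmp)
        buf.add({"url": "https://example.com/article"})
        buf.flush()  # write whatever is left at the end of a run
        print(buf.flushes)
```

In a sketch like this, each worker process would own its own buffer, so the total memory held at once scales with the number of processes times the limit.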

Required files and folder structure within Post-Processor directory:

output.xlsx will include a row for URL x from DomainOutput iff:

Note: The read_from_memory flag can be manually turned on and off in the main function of processor.py. To resume the processor after a previous break, run the program with read_from_memory set to True.
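To make the resume idea concrete, here is a minimal sketch of how a flag like read_from_memory could switch between starting fresh and reloading saved progress. The load_state and save_state helpers, the state file, and the "processed" key are assumptions made for illustration; the actual logic in processor.py is not shown in this README and may differ.

```python
import json
import tempfile
from pathlib import Path


def save_state(state_path, state):
    """Persist progress so that a later run can pick it up."""
    Path(state_path).write_text(json.dumps(state))


def load_state(state_path, read_from_memory):
    """Return saved progress when resuming, otherwise a fresh empty state."""
    path = Path(state_path)
    if read_from_memory and path.exists():
        return json.loads(path.read_text())
    # Fresh run: nothing has been processed yet.
    return {"processed": []}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "state.json"  # hypothetical file name
        save_state(state_file, {"processed": ["https://example.com/a"]})
        resumed = load_state(state_file, read_from_memory=True)
        print(resumed["processed"])
```

With the flag set to False, the saved file is ignored and processing would start from the beginning.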

Archived Branches

Test was a branch that was archived.

Can be restored by the following command: git checkout -b Test archive/Test

It was archived like this:

Another great resource