Closed bdcallen closed 4 years ago
@iangow Here is my new crontab, with the new lines as indicated
CODE_DIR=/home/bdcallen/asxlisting
EDGAR_CODE_DIR=/home/bdcallen/edgar # new line in crontab
ASXLISTING_DIR=/media/igow/6TB/data
ABN_LOOKUP_DIR=/home/bdcallen/abn_lookup
ASIC_DIR=/home/bdcallen/asic
26 17 * * * $CODE_DIR/./asx_prev_day_cronjob.sh
00 19 * * * $EDGAR_CODE_DIR/update_edgar.sh # new line in crontab
00 6 * * 5 $ABN_LOOKUP_DIR/./abn_lookup_cronjob.sh
00 3 * * 4 $ASIC_DIR/./asic_bulk_extract_cronjob.sh
I have also changed update_edgar.sh in my directory to
#!/usr/bin/env bash
echo "Running get_filings.R ..."
./$EDGAR_CODE_DIR/get_filings.R
echo "Running get_accession_nos.R ..."
./$EDGAR_CODE_DIR/get_accession_nos.R
echo "Running get_filer_ciks.R ..."
./$EDGAR_CODE_DIR/get_filer_ciks.R
echo "Running get_item_nos.R ..."
./$EDGAR_CODE_DIR/item_nos/get_item_nos.R
echo "Running get_item_no_desc.R ..."
./$EDGAR_CODE_DIR/item_nos/get_item_no_desc.R
# ./get_server_logs.R
echo "Running scrape_filing_docs.R ..."
./$EDGAR_CODE_DIR/filing_docs/scrape_filing_docs.R
Note that I've introduced a new environment variable, EDGAR_CODE_DIR, so that it does not clash with the similar variable used for asxlisting.
I've scheduled this cronjob for 7pm each night. I think this is a good time, as it corresponds to around 2am on the east coast of the US. I will close this if the cronjob works well, once it has written its output to dead.letter.
@iangow After fixing up some errors in my bash script above
#!/usr/bin/env bash
echo "Running get_filings.R ..."
$EDGAR_CODE_DIR/./get_filings.R
echo "Running get_accession_nos.R ..."
$EDGAR_CODE_DIR/./get_accession_nos.R
echo "Running get_filer_ciks.R ..."
$EDGAR_CODE_DIR/./get_filer_ciks.R
echo "Running get_item_nos.R ..."
$EDGAR_CODE_DIR/./item_nos/get_item_nos.R
echo "Running get_item_no_desc.R ..."
$EDGAR_CODE_DIR/./item_nos/get_item_no_desc.R
# $EDGAR_CODE_DIR/./get_server_logs.R
echo "Running scrape_filing_docs.R ..."
$EDGAR_CODE_DIR/./filing_docs/scrape_filing_docs.R
I managed to run the script successfully through cron this afternoon, with this being cron's output to dead.letter
Running get_filings.R ...
Updating data for 2019Q4...
Running get_accession_nos.R ...
Running get_filer_ciks.R ...
Running get_item_nos.R ...
Processing batch 1 of 3 ... 41.876 seconds
Processing batch 2 of 3 ... 84.587 seconds
Processing batch 3 of 3 ... 38.914 seconds
Running get_item_no_desc.R ...
Running scrape_filing_docs.R ...
Processing batch 1
Writing data ...
458.4041 seconds
Processing batch 2
Writing data ...
91.52212 seconds
Processing batch 3
Writing data ...
84.79389 seconds
Processing batch 4
Writing data ...
93.54987 seconds
Processing batch 5
Writing data ...
95.79431 seconds
Processing batch 6
Writing data ...
404.7744 seconds
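Judging from the difference between the two versions of the script, the fix was moving the `./` from in front of `$EDGAR_CODE_DIR` to after it. Because the variable holds an absolute path, a leading `./` turns it into a path relative to whatever directory cron starts the job in. The sketch below illustrates this with throwaway temporary directories; `demo_code_dir` and `demo_cwd` are hypothetical stand-ins, not paths from the real setup.

```shell
#!/usr/bin/env bash
# Throwaway stand-ins; none of these paths come from the real setup.
demo_code_dir="$(mktemp -d)"   # plays the role of EDGAR_CODE_DIR (an absolute path)
demo_cwd="$(mktemp -d)"        # plays the role of the directory the job starts in
touch "$demo_code_dir/get_filings.R"

cd "$demo_cwd" || exit 1

old_path="./$demo_code_dir/get_filings.R"   # form used in the first script
new_path="$demo_code_dir/./get_filings.R"   # form used in the corrected script

# The old form is resolved relative to the current directory, so it is not found.
if [ -e "$old_path" ]; then old_found=yes; else old_found=no; fi
# The new form is still an absolute path; the extra "./" is harmless.
if [ -e "$new_path" ]; then new_found=yes; else new_found=no; fi

echo "old form found: $old_found"
echo "new form found: $new_found"

# Clean up the temporary directories.
cd / && rm -rf "$demo_code_dir" "$demo_cwd"
```

The `./` after the variable is not required either; `$EDGAR_CODE_DIR/get_filings.R` resolves to the same file.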
I have set the main edgar cronjob to run at 9pm every night
26 17 * * * $CODE_DIR/./asx_prev_day_cronjob.sh
00 21 * * * $EDGAR_CODE_DIR/./update_edgar.sh # main edgar cronjob
00 0 * * * $EDGAR_CODE_DIR/./update_forms_345_tables.sh
00 6 * * 5 $ABN_LOOKUP_DIR/./abn_lookup_cronjob.sh
00 3 * * 4 $ASIC_DIR/./asic_bulk_extract_cronjob.sh
I also had to make a few minor tweaks to some of the programs and files used by update_edgar.sh. I'll commit these shortly.
@iangow I am going to close this for now, as the cronjob has been working well. Perhaps we could open a follow-up issue to detail anything we should add to the cronjob bash script.
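As one possible starting point for that follow-up, here is a minimal sketch of a wrapper that timestamps each step and reports failures. The `run_step` helper is hypothetical and not part of the existing update_edgar.sh, and `true` and `false` stand in for the real R scripts so the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the existing update_edgar.sh.
# Usage: run_step LABEL COMMAND [ARGS...]
run_step() {
    local label="$1"
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') Running $label ..."
    if ! "$@"; then
        # Report the failure on stderr and pass a non-zero status to the caller.
        echo "$(date '+%Y-%m-%d %H:%M:%S') $label FAILED" >&2
        return 1
    fi
}

# Stand-in commands; in the real script these would be calls such as
# "$EDGAR_CODE_DIR/get_filings.R".
ok_status=0
run_step "step that succeeds" true || ok_status=$?

fail_status=0
run_step "step that fails" false || fail_status=$?

echo "statuses: $ok_status $fail_status"
```

Whether a failed step should stop the remaining steps is a design choice; the sketch only records the status and leaves that decision to the caller.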
@iangow This issue is for setting up a cronjob that runs update_edgar.sh, the script that updates the fundamental tables (filings, accession_nos, filings_docs and so on), as we discussed previously.