iangow / se_core

Core code for StreetEvents data

Fix CRONJOB #9

Open bdcallen opened 4 years ago

bdcallen commented 4 years ago

@iangow This issue is for discussing the issue with the CRONJOB, particularly the part which updates the calls table, as discussed in the meeting.

iangow commented 4 years ago

So I think the first thing to do is to stop this cron job (probably by running crontab -e and commenting out the line for this one).

iangow commented 4 years ago

I just did crontab -e and edited it so this cron job doesn't run.
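For reference, the same edit could be scripted rather than made by hand in the editor. This is only a minimal sketch, assuming the job can be identified by a fixed string such as se_core_update; the function name comment_out_job is made up for illustration and is not part of se_core.

```shell
#!/bin/sh
# Comment out every active crontab line that contains the given string.
# Reads crontab text on stdin and writes the edited text to stdout;
# lines that are already commented out are passed through unchanged.
comment_out_job() {
    awk -v pat="$1" '
        index($0, pat) && $0 !~ /^[ \t]*#/ { print "# " $0; next }
        { print }
    '
}

# Hypothetical usage (this would replace the current user's crontab):
#   crontab -l | comment_out_job se_core_update | crontab -
```

Because the function only filters text, its output can be inspected first (for example by piping it to less) before it is piped back into crontab.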

iangow commented 4 years ago

When I run crontab -e, I see a file that has the following three lines as the last lines of the file. What do you see, @bdcallen ?

0 14 *  * * /etc/cron.daily/asx_prev_day_cronjob
# 0 14 *  * * /etc/cron.daily/se_core_rsync
# 05 14 *  * * /etc/cron.daily/se_core_update
iangow commented 4 years ago

@Yvonne-Han We're looking into this issue. But for now streetevents.calls seems to be good.

bdcallen commented 4 years ago

@iangow after I do

bdcallen@igow-z640:~$ crontab -e

this is what I see at the moment

  GNU nano 2.9.3        /tmp/crontab.g2K11Z/crontab

# Edit this file to introduce tasks to be run by cron.
# 
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
# 
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').# 
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
# 
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
# 
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
# 
# For more information see the manual pages of crontab(5) and cron(8)
# 
# m h  dom mon dow   command

33 14 * * * /etc/cron.daily/./asx_prev_day_cronjob.sh
iangow commented 4 years ago

OK. So it seems that the cron jobs are user-specific. I am not sure how these run. Perhaps you could add to your crontab the two that you want to add and see if they run fine.

bdcallen commented 4 years ago

@iangow I just tested the cronjob by changing my crontab to

35 15 * * * /etc/cron.daily/./asx_prev_day_cronjob

before 3:35pm, and it ran successfully as expected

crsp=> SELECT COUNT(*) FROM asxlisting.issuer_id_validity;
 count 
-------
  9573
(1 row)

crsp=> SELECT COUNT(*) FROM asxlisting.issuer_id_validity;
 count 
-------
  9773
(1 row)

crsp=> SELECT COUNT(*) FROM asxlisting.issuer_id_validity;
 count 
-------
 12273
(1 row)
bdcallen commented 4 years ago

@iangow So I've been doing some reading about cron, crontab and run-parts, and by reading this and then looking at the main crontab (the one for the whole system, not ours), which is at /etc/crontab, I found that the main crontab is this

17 *    * * *   root    cd / && run-parts --report /etc/cron.hourly
25 6    * * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6    * * 7   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6    1 * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )

What this does is use run-parts to run all the cronjobs that have been put in /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly and /etc/cron.monthly at the specified times. So all the cronjobs, including those for asxlisting and streetevents, are run at 6:25am daily by the main cronjob. Furthermore, /var/spool/cron/crontabs shows all crontabs that are run by cron, and indeed a glance at that shows that there are currently three crontabs, one named as your username, another named as my username, and then one called root, which is the main one.

So what this implies is that you may actually need to delete the streetevents executables from cron.daily for now to stop the streetevents cronjob entirely.
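To check which files in cron.daily run-parts would actually pick up, something like the sketch below might help. It rests on two things I believe are true of Debian's run-parts but have not verified on this machine: by default it only runs executable files whose names consist of ASCII letters, digits, underscores and hyphens (so a name ending in .sh would be skipped), and run-parts --test /etc/cron.daily prints the names it would run without running them. The function would_run_parts is a made-up stand-in that mimics only those default rules. Note also that the `test -x /usr/sbin/anacron ||` guard in /etc/crontab means cron itself only calls run-parts for the daily, weekly and monthly directories when anacron is not installed.

```shell
#!/bin/sh
# List the files in a directory that run-parts would pick up under its
# default rules (as assumed above): regular, executable, and named only
# with ASCII letters, digits, underscores, and hyphens.
would_run_parts() {
    dir=$1
    for f in "$dir"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$f" ] && [ -x "$f" ] || continue
        name=${f##*/}
        case $name in
            *[!A-Za-z0-9_-]*) continue ;;  # contains a disallowed character
        esac
        printf '%s\n' "$name"
    done
}

# Hypothetical usage:
#   would_run_parts /etc/cron.daily
```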

bdcallen commented 4 years ago

@iangow This also means the first source I read about cron was wrong, and that adding the executables to cron.daily was sufficient. We didn't have to make a crontab at all, and essentially we've been running some daily jobs twice a day. This raises the question though, would we prefer to use the main cronjob by adding executables to cron.daily, or by using a crontab?
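One way to spot this kind of double scheduling is to look for active crontab lines that point back into cron.daily. A rough sketch, where the function name jobs_pointing_into is made up and the input is assumed to be plain crontab text:

```shell
#!/bin/sh
# Print the active (uncommented) crontab lines that mention a directory.
# Reads crontab text on stdin; $1 is the directory, e.g. /etc/cron.daily.
jobs_pointing_into() {
    grep -v '^[[:space:]]*#' | grep -F -- "$1/"
}

# Hypothetical usage:
#   crontab -l | jobs_pointing_into /etc/cron.daily
```

Any line this prints is a candidate for running twice, once from the user crontab and once through the system-wide run of cron.daily.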

bdcallen commented 4 years ago

@iangow I just made a simple bash script test.sh, which was simply

#!/bin/bash
echo 'Hello World!' > /home/bdcallen/abn_lookup/helloworld.txt

I then made it executable by doing

bdcallen@igow-z640:~$ chmod u+x abn_lookup_cronjob.sh

I then tested this in my cronjob, by doing crontab -e and adding the line

07 19 * * * /home/bdcallen/abn_lookup/./test.sh

a minute before 7:19pm, then watched what happened in the file manager in /home/bdcallen/abn_lookup, and as 7:19pm ticked over, helloworld.txt appeared. So basically we can do something similar with the other bash scripts to set up cronjobs in crontab, if we want to do it this way.
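When reading schedule lines like the one above, it is worth remembering that the first two crontab fields are minute and then hour, so a line starting 07 19 fires at 19:07 and one starting 35 15 fires at 15:35. A small sketch that decodes this, assuming both fields are plain numbers (no *, ranges or lists); cron_time is a made-up helper name:

```shell
#!/bin/sh
# Print the HH:MM at which a crontab line fires, given the whole line as $1.
# Only handles plain numeric minute and hour fields.
cron_time() {
    # Split the line into minute, hour, and everything else.
    read -r min hour rest <<EOF
$1
EOF
    # Drop one leading zero so values like 08 are not parsed as octal.
    case $min in 0?) min=${min#0} ;; esac
    case $hour in 0?) hour=${hour#0} ;; esac
    printf '%02d:%02d\n' "$hour" "$min"
}

# Hypothetical usage:
#   cron_time '07 19 * * * /home/bdcallen/abn_lookup/./test.sh'
```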

iangow commented 4 years ago

This raises the question though, would we prefer to use the main cronjob by adding executables to cron.daily, or by using a crontab?

Whatever is easiest to maintain and effective.

bdcallen commented 4 years ago

@iangow Confirmation that the cronjob is indeed running twice. This was before 2pm today

crsp=> select description from pg_description
join pg_class on pg_description.objoid = pg_class.oid
where relname = 'issuer_id_validity';
                                description                                
---------------------------------------------------------------------------
 Created using create_issuer_id_validity.py, ON 2019-11-29 00:20:05.160187
(1 row)

and this is what I got from the same command after 2:10pm, when the issuer_ids part of the cronjob had finished

crsp=> select description from pg_description
join pg_class on pg_description.objoid = pg_class.oid
where relname = 'issuer_id_validity';
                                description                                
---------------------------------------------------------------------------
 Created using create_issuer_id_validity.py, ON 2019-11-29 14:10:50.679740
(1 row)

So the time is indeed in Australian Eastern Standard Time, and it has been updating at midnight as well as 2pm.

bdcallen commented 4 years ago

@iangow I have a feeling I might have stopped the main crontab. I did this to try to stop my cronjob, as it had not finished in the time I had expected and wasn't printing output to dead.letter

bdcallen@igow-z640:~$ ps -o pid,sess,cmd afx | egrep "( |/)cron( -f)?$"
 1022  1022 /usr/sbin/cron -f
bdcallen@igow-z640:~$ pkill -s 1022
pkill: killing pid 1022 failed: Operation not permitted
pkill: killing pid 28279 failed: Operation not permitted
bdcallen@igow-z640:~$ sudo pkill -s 1022
bdcallen@igow-z640:~$ ps -o pid,sess,cmd afx | egrep "( |/)cron( -f)?$"

The last line gave nothing. So I tried

bdcallen@igow-z640:~$ sudo cron
bdcallen@igow-z640:~$ ps -o pid,sess,cmd afx | egrep "( |/)cron( -f)?$"
28985 28985  \_ cron
bdcallen@igow-z640:~$ crontab -e
crontab: installing new crontab
bdcallen@igow-z640:~$ ps -o pid,sess,cmd afx | egrep "( |/)cron( -f)?$"
28985 28985  \_ cron

So it seems cron is running, but with a new PID, and shown as plain cron rather than /usr/sbin/cron with the -f option. Do you know what is going on?
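For what it's worth, the egrep pattern used above matches both forms of the command line, so the two outputs can be compared directly. The sketch below just wraps that pattern in a function (find_cron is a made-up name) so it can be tried on saved ps output. My understanding, which is an assumption about this machine rather than something I have checked, is that on Debian/Ubuntu the cron service is started by systemd as /usr/sbin/cron -f, whereas running sudo cron by hand starts a separate daemonised copy that shows up as plain cron.

```shell
#!/bin/sh
# Filter `ps -o pid,sess,cmd afx`-style output (on stdin) down to lines
# whose command ends in "cron" or "cron -f", preceded by a space or a slash.
find_cron() {
    grep -E '( |/)cron( -f)?$'
}

# Hypothetical usage:
#   ps -o pid,sess,cmd afx | find_cron
# Assumption: on a systemd machine where the unit is named "cron",
# `systemctl status cron` would show whether the service copy is running.
```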

By the way, after I did this, my cronjob ran successfully, with the line

26 17 * * * $CODE_DIR/./asx_prev_day_cronjob.sh

with CODE_DIR set to my asxlisting directory and ASXLISTING_DIR set to the data directory on the 6TB drive (both set in the crontab). I'll keep an eye on what happens at midnight tonight. Otherwise we'll discuss this tomorrow.

iangow commented 4 years ago

@bdcallen Where are we on this issue? Is the cron job still running twice each day?

bdcallen commented 4 years ago

@iangow Are you referring to the one for se_core or asxlisting? I erroneously put a few posts about the latter here, because the issue names are similar and I confused them one day. But with regard to the latter, I have now deleted the bash script executable from /etc/cron.daily

bdcallen@igow-z640:/etc/cron.daily$ ls
0anacron  apport  apt-compat  asx_prev_day_cronjob  bsdmainutils  cracklib-runtime  dpkg  logrotate  man-db  mlocate  passwd  popularity-contest  se_core_rsync  se_core_update  sysstat  ubuntu-advantage-tools  update-notifier-common  upstart
bdcallen@igow-z640:/etc/cron.daily$ rm asx_prev_day_cronjob
rm: remove write-protected regular file 'asx_prev_day_cronjob'? y
rm: cannot remove 'asx_prev_day_cronjob': Permission denied
bdcallen@igow-z640:/etc/cron.daily$ sudo rm asx_prev_day_cronjob
[sudo] password for bdcallen: 
bdcallen@igow-z640:/etc/cron.daily$ ls
0anacron  apport  apt-compat  bsdmainutils  cracklib-runtime  dpkg  logrotate  man-db  mlocate  passwd  popularity-contest  se_core_rsync  se_core_update  sysstat  ubuntu-advantage-tools  update-notifier-common  upstart

and looking at our crontabs

bdcallen@igow-z640:/etc/cron.daily$ sudo crontab -l -u igow 
# Edit this file to introduce tasks to be run by cron.
# 
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
# 
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').# 
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
# 
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
# 
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
# 
MAILTO="iandgow@fastmail.com"
# For more information see the manual pages of crontab(5) and cron(8)
# 
# m h  dom mon dow   command
# 0 14 *  * * /etc/cron.daily/asx_prev_day_cronjob
# 0 14 *  * * /etc/cron.daily/se_core_rsync
# 05 14 *  * * /etc/cron.daily/se_core_update
bdcallen@igow-z640:/etc/cron.daily$ crontab -l
# Edit this file to introduce tasks to be run by cron.
# 
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
# 
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').# 
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
# 
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
# 
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
# 
# For more information see the manual pages of crontab(5) and cron(8)
# 
# m h  dom mon dow   command

PATH=/home/bdcallen/anaconda3/bin:/home/bdcallen/perl5/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
PGHOST=10.101.13.99
PGDATABASE=crsp
PGUSER=bdcallen
PGPASSWORD=temp_20180308
CODE_DIR=/home/bdcallen/asxlisting
EDGAR_CODE_DIR=/home/bdcallen/edgar
ASXLISTING_DIR=/media/igow/6TB/data
ABN_LOOKUP_DIR=/home/bdcallen/abn_lookup
ASIC_DIR=/home/bdcallen/asic

26 17 * * * $CODE_DIR/./asx_prev_day_cronjob.sh
00 21 * * * $EDGAR_CODE_DIR/./update_edgar.sh
00 0 * * * $EDGAR_CODE_DIR/./update_forms_345_tables.sh
00 6 * * 5 $ABN_LOOKUP_DIR/./abn_lookup_cronjob.sh
00 3 * * 4 $ASIC_DIR/./asic_bulk_extract_cronjob.sh

You have commented out the asxlisting cronjob in your crontab, but I haven't in mine, so that cronjob should now run only once a day.