7yl4r / extracted_sat_ts_gom_csv_data


ensure csv files are updating #2

Open 7yl4r opened 1 year ago

7yl4r commented 1 year ago

Files should be updating nightly, but the last automated commit was a week ago. Need to check the logs to find out why the cronjob isn't working.
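One quick way to spot what the nightly job missed is to list files that haven't been modified in the last day. A minimal sketch using a throwaway directory (the real data directory on the server would be substituted):

```shell
# Hypothetical staleness check: list .csv files not touched in the last day.
# Demo uses a throwaway directory with simulated stale and fresh files.
mkdir -p /tmp/csv_demo
touch -d '3 days ago' /tmp/csv_demo/old.csv   # simulates a stale file
touch /tmp/csv_demo/fresh.csv                 # simulates last night's output
find /tmp/csv_demo -name '*.csv' -mtime +1    # prints only the stale file
```

Note `touch -d` is GNU coreutils syntax.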

7yl4r commented 1 year ago

Methods have been updated. Files should update in GitHub tomorrow at 8:00 UTC.

dotis commented 1 year ago

This seems to be working. My extracted files are being picked up and put on GH.

dotis commented 1 year ago

I am re-opening this, as new .csv files are not getting to GH. @7yl4r did a manual push last week, but there have been no updates to the .csv files in the data directory since then.

7yl4r commented 1 year ago

~dotis/upload_csv_files_to_gh.log is empty, meaning something went wrong with the cronjob /bin/bash /srv/imars-objects/tpa_pgs/rois/gom/extracted_sat_ts_gom_csv_data/upload_files.sh > upload_csv_files_to_gh.log.

After running it manually, with the log-file redirection included, to see what would happen: it worked again, but I got output in the console instead of the logfile:

Enumerating objects: 2826, done.
Counting objects: 100% (2826/2826), done.
Delta compression using up to 40 threads
Compressing objects: 100% (2691/2691), done.
Writing objects: 100% (2691/2691), 44.66 MiB | 3.44 MiB/s, done.
Total 2691 (delta 1859), reused 0 (delta 0)
remote: Resolving deltas: 100% (1859/1859), completed with 83 local objects.
To github.com:7yl4r/extracted_sat_ts_gom_csv_data.git
   f744b78..8afc830  main -> main
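For context, git push writes its progress messages to stderr, so `command > logfile` captures nothing from it. A minimal demonstration of the difference (file names here are throwaways):

```shell
# Simulate a command that writes to both streams, like git push does.
# '>' alone captures only stdout; stderr still lands on the console.
sh -c 'echo to-stdout; echo to-stderr 1>&2' > /tmp/stdout_only.log
# /tmp/stdout_only.log now contains only "to-stdout".

# Redirect both streams; note the order: '2>&1' must come after '> file'.
sh -c 'echo to-stdout; echo to-stderr 1>&2' > /tmp/both.log 2>&1
# /tmp/both.log now contains both lines.
```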

I updated the command's redirection to include 2>&1 and found a new message in the .log:

[main 8afc830] auto-upload csv files
 Committer: Dan Otis <dotis@manglilloo.marine.usf.edu>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly:

    git config --global user.name "Your Name"
    git config --global user.email you@example.com

After doing this, you may fix the identity used for this commit with:

    git commit --amend --reset-author

 894 files changed, 19230 insertions(+), 16038 deletions(-)

Alternative approach: trying to run it via the script command:

manglilloo:~> script -c "/bin/bash /srv/imars-objects/tpa_pgs/rois/gom/extracted_sat_ts_gom_csv_data/upload_files.sh" upload_csv_files_to_gh.log

This gives a known_hosts key-mismatch error for github.com. I cleared the github.com entry from ~dotis/.ssh/known_hosts and now I am not getting the error. I updated the cronjob to use the script syntax above.
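For future reference, ssh-keygen can remove a single host's entry rather than editing known_hosts by hand. A sketch against a throwaway known_hosts file (the key pair and all paths here are generated just for the demo):

```shell
# Generate a throwaway key pair and build a demo known_hosts file:
rm -f /tmp/demo_key /tmp/demo_key.pub
ssh-keygen -q -t ed25519 -N '' -f /tmp/demo_key
printf 'github.com %s\n' "$(cut -d' ' -f1-2 /tmp/demo_key.pub)" > /tmp/known_hosts_demo

# Remove just the github.com entry (a backup is written to *.old):
ssh-keygen -R github.com -f /tmp/known_hosts_demo
```

Running the same against the real file would be `ssh-keygen -R github.com` with no -f flag.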

We can check back next week to see if it has run properly.

dotis commented 1 year ago

The cron push is still not working. However, @dotis did a manual push from manglilloo and it worked.

dotis commented 1 year ago

Question for @7yl4r - Once the .csv files are updated on GH, how long until Airflow will pick up the files and send to grafana?

7yl4r commented 1 year ago

Airflow should pick up the files daily. A new DAG run can also be triggered from the GUI at any time.

I added a data-flow diagram to the readme, including timings and data locations, to help clarify how all the parts fit together.

dotis commented 1 year ago

The push to GH did not work over the weekend. I did a manual push and ran the DAGs.

dotis commented 1 year ago

This worked yesterday. @7yl4r can you check the crontab setting? Perhaps this is only set to run once a week in the crontab, although the data-flow diagram indicates a daily git commit.

7yl4r commented 1 year ago

Crontab moved to 09:00 instead of 08:00.

dotis commented 1 year ago

The crontab line on my manglilloo account is: 0 9 * * * /srv/imars-objects/tpa_pgs/rois/gom/extracted_sat_ts_gom_csv_data/upload_files.sh 2>&1 upload_csv_files_to_gh.log

However, if I run this from the command line, I get a permission issue.

But when I run it like this from the /srv/imars-objects/tpa_pgs/rois/gom/extracted_sat_ts_gom_csv_data/ directory, it works: bash ./upload_files.sh

Perhaps we should add a "bash" prefix at the beginning of the crontab line? Just a thought...
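A crontab entry along those lines might look like this (a sketch only: the schedule and log location are assumptions, and $HOME is expanded by the shell cron hands the command to):

```shell
# m h dom mon dow  command
0 9 * * * /bin/bash /srv/imars-objects/tpa_pgs/rois/gom/extracted_sat_ts_gom_csv_data/upload_files.sh > "$HOME/upload_csv_files_to_gh.log" 2>&1
```

Invoking via /bin/bash sidesteps both the execute-bit problem and cron's minimal PATH, and the > file 2>&1 ordering captures stderr as well as stdout.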

7yl4r commented 1 year ago

Added /bin/bash back, and added an export for the host-key-checking env var. Let's see if that fixes it. If not, I can try adding >> logfile.txt to each command inside the .sh script to see if we get any logs that way.
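Rather than appending >> logfile.txt to every command, a single exec at the top of the script redirects everything that follows. A minimal sketch (the log path is a placeholder):

```shell
#!/bin/bash
# Redirect this script's stdout and stderr to a log, then trace each command.
exec >> /tmp/upload_files_demo.log 2>&1
set -x                      # echo every command as it runs, for debugging
echo "csv upload starting"  # appears in the log, not on the console
```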

dotis commented 10 months ago

The data hasn't pushed for a few weeks now. I can't seem to do a manual update either; I get an error on line 15 of upload_files.sh:

./upload_files.sh: line 15: -a: command not found
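An error of the form "line N: -a: command not found" usually means bash executed a line beginning with -a on its own, most often because a trailing backslash continuation on the previous line was lost. A hypothetical reproduction (the real contents of upload_files.sh aren't shown in this thread):

```shell
# Write a two-line script whose continuation backslash has gone missing,
# so line 2 is run as a standalone command named '-a':
printf 'echo syncing\n-a src dst\n' > /tmp/broken_demo.sh
bash /tmp/broken_demo.sh
# line 2 fails with: /tmp/broken_demo.sh: line 2: -a: command not found
```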

This may all be moot if we use another way to get the csv files to grafana, like the Google data buckets.