frozenpandaman / s3s

Successor to splatnet2statink. Takes battle data from the SplatNet 3 app and uploads it to stat.ink!
https://github.com/frozenpandaman/s3s/wiki
GNU General Public License v3.0

Occasional exceptions raised when running from crontab #97

Closed. Ogekuri closed this issue 1 year ago.

Ogekuri commented 1 year ago

I'm still experiencing occasional exceptions when I run s3s from crontab:

Traceback (most recent call last):
  File "s3s.py", line 1889, in <module>
    main()
  File "s3s.py", line 1883, in main
    fetch_and_upload_single_result(hash, noun, blackout, test_run) # not monitoring mode
  File "s3s.py", line 1302, in fetch_and_upload_single_result
    post_result(result, False, isblackout, istestrun) # not monitoring mode
  File "s3s.py", line 1114, in post_result
    payload = prepare_job_result(results[i]["data"], ismonitoring, isblackout, overview_data, prevresult=prevresult)
  File "s3s.py", line 878, in prepare_job_result
    prev_job = json.loads(prev_job_post.text)
  File "/usr/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

It starts in fetch_and_upload_single_result, when the result_post = requests.post(utils.GRAPHQL_URL, ...) call fails with result_post.status_code == 401.

This makes json.loads(result_post.text) in fetch_and_upload_single_result fail, and also json.loads(prev_job_post.text) inside prepare_job_result, which is what raises the exception above.
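
For illustration, this is roughly the failing pattern; just a sketch with made-up function and parameter names, only to show where the JSONDecodeError comes from:

import json
import requests

def fetch_result(url, body, headers, cookies):
    """Sketch: guard against a SplatNet 3 response that is not JSON."""
    result_post = requests.post(url, data=body, headers=headers, cookies=cookies)
    if result_post.status_code != 200:
        # e.g. 401 when the token has expired: the body is not JSON, and
        # calling json.loads() on it is what raises the JSONDecodeError
        return None
    return json.loads(result_post.text)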

I don't know whether it's better to fix it in fetch_and_upload_single_result or in prepare_job_result.

Could you try to check it?

Many thanks!

frozenpandaman commented 1 year ago

Thanks! I'm not sure why that request would fail, but if it does, I made it so it'll just pass silently (i.e. no data on previous job rank/points will be sent). Let me know if the changes in 368e519 fix this for you.
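
Roughly, the idea is to treat a failed previous-job lookup as "no data", something like this sketch (not the exact lines in the commit):

import json

def parse_prev_job(prev_job_post):
    """Sketch: fall back to 'no previous job' if the response is not JSON."""
    try:
        return json.loads(prev_job_post.text)
    except json.decoder.JSONDecodeError:
        return None  # e.g. a 401 - previous rank/points are simply omitted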

Ogekuri commented 1 year ago

Better, but I think we also have to wrap the json.loads(result_post.text) call in fetch_and_upload_single_result (line 1303) in a try/except.

Here is the error that still occurs:

Traceback (most recent call last):
  File "s3s.py", line 1888, in <module>
    main()
  File "s3s.py", line 1882, in main
    fetch_and_upload_single_result(hash, noun, blackout, test_run) # not monitoring mode
  File "s3s.py", line 1303, in fetch_and_upload_single_result
    result = json.loads(result_post.text)
  File "/usr/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
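
Something like this would be enough for my use case; just a sketch using the names already in fetch_and_upload_single_result:

try:
    result = json.loads(result_post.text)
except json.decoder.JSONDecodeError:
    print(f"Could not parse battle result (HTTP {result_post.status_code}) - skipping upload.")
    sys.exit(1)
post_result(result, False, isblackout, istestrun)  # not monitoring mode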

Many thanks!

ADeeeee commented 1 year ago

I have a cron job which runs every hour, and this seems to happen about every 3 hours (12, 15, 18, 21 o'clock today).

Maybe we could retry a few times when the GraphQL API returns something unexpected with a 401, in order to reduce this error.

ADeeeee commented 1 year ago

Okay, it's been a while since 10th Dec. I think the results so far are enough to report.

I only run the job hourly, from 10:13-18:13 on workdays and 02:13-18:13 on weekends. The reason I run it at minute 13 is to avoid the most common trigger times like minute 00, 10, 15, 20, etc., in order to reduce errors from SplatNet 3.

It will also rerun s3s.py once, 3 minutes after the first run fails (an error is raised), e.g.: python3 s3s.py -r || (sleep 180 && python3 s3s.py)

And here is how I simply record it:

--- a/s3s.py
+++ b/s3s.py
@@ -1345,9 +1345,13 @@ def fetch_and_upload_single_result(hash, noun, isblackout, istestrun):
                        data=utils.gen_graphql_body(utils.translate_rid[dict_key], dict_key2, hash),
                        headers=headbutt(forcelang=lang),
                        cookies=dict(_gtoken=GTOKEN))
-       result = json.loads(result_post.text)
-       post_result(result, False, isblackout, istestrun) # not monitoring mode
-
+       try:
+               result = json.loads(result_post.text)
+               post_result(result, False, isblackout, istestrun) # not monitoring mode
+       except json.decoder.JSONDecodeError as e:
+               date = os.popen("echo $(date)").read().replace("\n", "")
+               os.popen(f"echo '{date} - {result_post}' >> debug.log")
+               raise(e)

NOTE: All timestamps are converted to UTC+0 here.

The failures are below:

Mon Jan 23 14:13:20 UTC 2023 - <Response [401]>
Tue Jan 24 14:13:19 UTC 2023 - <Response [401]>
Wed Jan 25 14:13:23 UTC 2023 - <Response [401]>
Thu Jan 26 14:13:19 UTC 2023 - <Response [401]>
Fri Jan 27 14:13:26 UTC 2023 - <Response [401]>
Sat Jan 28 09:13:18 UTC 2023 - <Response [401]>
Sun Jan 29 04:13:22 UTC 2023 - <Response [401]>
Sun Jan 29 14:13:22 UTC 2023 - <Response [401]>
Wed Feb 1 14:13:29 UTC 2023 - <Response [401]>

Conclusion in short:

  1. It only failed once each time and succeeded on the retry. (The timestamp would be xx:16 if it were a retry.)
  2. It usually failed at 14:00 UTC+0.
  3. It seems a lot better than when this issue was opened. As I recall, it failed quite often before December.
  4. The response is always 401.

I'd still recommend retrying the result_post request if the response isn't in JSON format, instead of running the whole script again.
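
Something like this is what I have in mind, just a sketch (the function and parameter names here are made up, not s3s.py's actual ones):

import json
import time
import requests

def post_graphql_with_retry(url, body, headers, cookies, tries=3, wait=10):
    """Sketch: retry a SplatNet 3 GraphQL request when the body is not JSON."""
    for attempt in range(tries):
        resp = requests.post(url, data=body, headers=headers, cookies=cookies)
        try:
            return json.loads(resp.text)
        except json.decoder.JSONDecodeError:
            # typically a 401 with an empty or non-JSON body
            if attempt == tries - 1:
                raise
            time.sleep(wait)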

cesaregarza commented 1 year ago

I suspect it's caused by the GTOKEN expiring by the time the request is made, which would explain the 401 response. Regenerating it on a non-200 response might work, but I'll need to do some testing to see if this is the cause.
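
Something along these lines, as a rough sketch (refresh_gtoken and make_cookies are stand-ins here, not actual s3s functions):

import json
import requests

def post_with_token_refresh(url, body, headers, make_cookies, refresh_gtoken):
    """Sketch: if the first attempt is not a 200, regenerate the token and retry once."""
    resp = requests.post(url, data=body, headers=headers, cookies=make_cookies())
    if resp.status_code != 200:  # e.g. 401 because the GTOKEN has expired
        refresh_gtoken()         # stand-in for s3s's token regeneration
        resp = requests.post(url, data=body, headers=headers, cookies=make_cookies())
    return json.loads(resp.text)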

frozenpandaman commented 1 year ago

See #101.