voussoir / timesearch

The subreddit archiver
BSD 3-Clause "New" or "Revised" License

get_submissions doesn't work for some subreddits #15

Closed DoaJCBlogger closed 1 year ago

DoaJCBlogger commented 1 year ago

I'm trying to run get_submissions for /r/insaneparents, but it doesn't download anything past September 7, 2022. It works for several other subreddits, such as /r/UnresolvedMysteries. The last Unix timestamp in the submissions table for /r/insaneparents is 1662563491. I was able to get all the comments using get_comments. This is the error I get from the latest version.

python.exe timesearch.py get_submissions -r insaneparents
Thank you Jason Baumgartner of Pushshift.io!
Traceback (most recent call last):
  File "C:\Users\777\Downloads\timesearch-master2\timesearch.py", line 554, in <module>
    raise SystemExit(main(sys.argv[1:]))
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\vlogging.py", line 218, in wrapped
    return main(argv, *args, **kwargs)
  File "C:\Users\777\Downloads\timesearch-master2\timesearch.py", line 546, in main
    return betterhelp.go(parser, argv)
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\betterhelp.py", line 620, in go
    return _go_multi(parser, argv, args_postprocessor=args_postprocessor)
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\betterhelp.py", line 616, in _go_multi
    return main(argv)
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\betterhelp.py", line 578, in main
    return args.func(args)
  File "C:\Users\777\Downloads\timesearch-master2\timesearch.py", line 56, in get_submissions_gateway
    get_submissions.get_submissions_argparse(args)
  File "C:\Users\777\Downloads\timesearch-master2\timesearch_modules\get_submissions.py", line 96, in get_submissions_argparse
    return get_submissions(
  File "C:\Users\777\Downloads\timesearch-master2\timesearch_modules\get_submissions.py", line 75, in get_submissions
    step = database.insert(chunk)
  File "C:\Users\777\Downloads\timesearch-master2\timesearch_modules\tsdb.py", line 347, in insert
    status = method(obj)
  File "C:\Users\777\Downloads\timesearch-master2\timesearch_modules\tsdb.py", line 427, in insert_submission
    (qmarks, bindings) = sqlhelpers.insert_filler(postdata)
TypeError: insert_filler() missing 1 required positional argument: 'values'

This is the error I get from a slightly older version.

python.exe timesearch.py get_submissions -r insaneparents
Thank you Jason Baumgartner of Pushshift.io!
Traceback (most recent call last):
  File "C:\Users\777\Downloads\timesearch-master\timesearch.py", line 553, in <module>
    raise SystemExit(main(sys.argv[1:]))
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\vlogging.py", line 218, in wrapped
    return main(argv, *args, **kwargs)
  File "C:\Users\777\Downloads\timesearch-master\timesearch.py", line 545, in main
    return betterhelp.go(parser, argv)
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\betterhelp.py", line 620, in go
    return _go_multi(parser, argv, args_postprocessor=args_postprocessor)
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\betterhelp.py", line 616, in _go_multi
    return main(argv)
  File "C:\Users\777\AppData\Roaming\Python\Python39\site-packages\voussoirkit\betterhelp.py", line 578, in main
    return args.func(args)
  File "C:\Users\777\Downloads\timesearch-master\timesearch.py", line 56, in get_submissions_gateway
    get_submissions.get_submissions_argparse(args)
  File "C:\Users\777\Downloads\timesearch-master\timesearch_modules\get_submissions.py", line 96, in get_submissions_argparse
    return get_submissions(
  File "C:\Users\777\Downloads\timesearch-master\timesearch_modules\get_submissions.py", line 75, in get_submissions
    step = database.insert(chunk)
  File "C:\Users\777\Downloads\timesearch-master\timesearch_modules\tsdb.py", line 347, in insert
    status = method(obj)
  File "C:\Users\777\Downloads\timesearch-master\timesearch_modules\tsdb.py", line 398, in insert_submission
    url = submission.url
AttributeError: 'DummySubmission' object has no attribute 'url'
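
As a sanity check on the date in the report, the last Unix timestamp quoted above can be converted to UTC with Python's standard library. This is only a sketch, and the variable names are illustrative rather than taken from timesearch.

```python
from datetime import datetime, timezone

# Last Unix timestamp reported in the submissions table for /r/insaneparents.
last_timestamp = 1662563491

# Interpret the value as seconds since the Unix epoch, in UTC.
last_seen = datetime.fromtimestamp(last_timestamp, tz=timezone.utc)
print(last_seen.isoformat())  # 2022-09-07T15:11:31+00:00
```

The result falls on September 7, 2022, which matches the cutoff described in the report.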

DoaJCBlogger commented 1 year ago

I made some changes and it seems to work now. I'll run it overnight and post a Git patch tomorrow if it still works.

voussoir commented 1 year ago

Hi, could you try running python -m pip install --upgrade voussoirkit? That should stop the sqlhelpers error. I did make changes to those functions, and I know it's not very professional of me to have broken compatibility like that, sorry.
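
Before and after upgrading, it may help to confirm which voussoirkit version Python actually sees. The sketch below uses only the standard library's importlib.metadata (Python 3.8+). The helper name installed_version is made up for illustration and is not part of timesearch or voussoirkit.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # The package is not installed for this interpreter.
        return None


print("voussoirkit:", installed_version("voussoirkit"))
```

Run it with the same python.exe that runs timesearch.py, since each interpreter has its own site-packages.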

As for the url error, I think I fixed that in c8c160e00e600b5ee84c26044cbe6855eb96cacf in response to #13.

Thanks
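
For readers wondering how an AttributeError like the one in the older traceback comes about, here is a minimal sketch of reading an optional attribute defensively. The DummySubmission class below is a hypothetical stand-in, not the class from timesearch, and this is not necessarily what the commit referenced above does. It only illustrates the general pattern of using getattr with a default.

```python
class DummySubmission:
    """Hypothetical stand-in for a placeholder object with no `url` attribute."""

    def __init__(self, submission_id):
        self.id = submission_id


def url_or_none(submission):
    # getattr with a default avoids AttributeError when the attribute is absent.
    return getattr(submission, "url", None)


placeholder = DummySubmission("abc123")
print(url_or_none(placeholder))  # None
```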

DoaJCBlogger commented 1 year ago

It looks like upgrading voussoirkit fixed it. Thank you.

voussoir commented 1 year ago

Cool, sorry for the inconvenience.