The pipeline doesn't work anymore:
/tmp/ratholeradio-archive (git)-[master] % datalad crawl
[INFO ] Loading pipeline definition from ./.datalad/crawl/pipelines/pipeline.py
[ERROR ] Failed to import pipeline from ./.datalad/crawl/pipelines/pipeline.py: No module named 'datalad.crawler' [pipeline.py:<module>:39] [pipeline.py:load_pipeline_from_module:403] (RuntimeError)
I tried a cheap fix:
diff --git a/.datalad/crawl/pipelines/pipeline.py b/.datalad/crawl/pipelines/pipeline.py
index 5e0618a..f14bcef 100644
--- a/.datalad/crawl/pipelines/pipeline.py
+++ b/.datalad/crawl/pipelines/pipeline.py
@@ -36,10 +36,10 @@ import re
from os.path import join as opj, basename
from datalad.utils import updated
-from datalad.crawler.nodes.annex import Annexificator
-from datalad.crawler.nodes.crawl_url import crawl_url
-from datalad.crawler.nodes.misc import sub
-from datalad.crawler.nodes.matches import a_href_match, css_match
+from datalad_crawler.nodes.annex import Annexificator
+from datalad_crawler.nodes.crawl_url import crawl_url
+from datalad_crawler.nodes.misc import sub
+from datalad_crawler.nodes.matches import a_href_match, css_match
from logging import getLogger
lgr = getLogger('datalad.custom.ratholeradio')
but that only leads to
% datalad crawl
[INFO ] Loading pipeline definition from ./.datalad/crawl/pipelines/pipeline.py
[INFO ] Creating a pipeline for the ratholeradio.org podcasts
[INFO ] Running pipeline [[<datalad_crawler.nodes.crawl_url.crawl_url object at 0x7f8831b96f60>, a_href_match(query=<<'.*/(?P<year>2[0-9]{3}...>>), <datalad_crawler.nodes.crawl_url.crawl_url object at 0x7f8831b96f98>, [sub(ok_missing=False, subs=<<{'response': {'</?stro...>>), css_match(query='div#page .entry'), css_match(query='div#page .entry'), <function process_episode at 0x7f883d05b620>, <datalad_crawler.nodes.annex.Annexificator object at 0x7f884044ee10>]], <bound method Annexificator.finalize of <datalad_crawler.nodes.annex.Annexificator object at 0x7f884044ee10>>]
[INFO ] Fetching 'http://ratholeradio.org'
[WARNING] Failed to open cookies DB /home/mih/.config/datalad/cookies: db type could not be determined [__init__.py:open:88]
[WARNING] Failed to check for having a cookie for http://ratholeradio.org: argument of type 'NoneType' is not iterable [cookies.py:__contains__:85]
[ERROR ] 'function' object is not iterable [pipeline.py:xrun_pipeline_steps:270] (TypeError)
Yeah, there was some code breakage since then, and the cookies DB not being usable across Python releases is a known issue.
But I don't think there is any new episode yet.
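For what it's worth, the import rename in the diff above could also be written so that the pipeline loads under either package layout. This is only a rough sketch: `import_first_available` and `load_crawler_nodes` are hypothetical helper names, not part of datalad, and the two module prefixes are simply the ones that appear in the diff. It only addresses the `No module named 'datalad.crawler'` error, not the later `TypeError`.

```python
import importlib


def import_first_available(candidates):
    """Return the first module from `candidates` that can be imported.

    Hypothetical helper, not part of datalad.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


def load_crawler_nodes():
    """Look up the node classes used by pipeline.py under either layout.

    Assumption: both layouts expose the same submodule and attribute names,
    as the diff above suggests.
    """
    prefixes = ("datalad_crawler", "datalad.crawler")

    def module(suffix):
        return import_first_available([p + suffix for p in prefixes])

    return {
        "Annexificator": module(".nodes.annex").Annexificator,
        "crawl_url": module(".nodes.crawl_url").crawl_url,
        "sub": module(".nodes.misc").sub,
        "a_href_match": module(".nodes.matches").a_href_match,
        "css_match": module(".nodes.matches").css_match,
    }
```

In `pipeline.py` one would call `load_crawler_nodes()` in place of the four `from ... import ...` lines and pull the names out of the returned dict.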