Airflow workflows:
Please write your name next to the task, so that people do not overlap on the same issue:
[x] 1. Not all links in documentation work properly
[ ] 2. Some tests have very large "expected" values; it would be good to move them to a separate file
[x] 3. Align harvesting DAGs' schedules with production (aps.py, elsevier_pull_ftp.py, hindawi.py, dag_pull_ftp.py, oup_pull_ftp.py)
[x] 4. Rename the harvesting DAGs and the files they live in so the names correctly reflect their source and transfer protocol: for example, the IOP DAG is in dag_pull_ftp and should be renamed to iop_pull_sftp.
IOP, Springer, Elsevier - SFTP
OUP - FTP
Hindawi, APS - API
[ ] 5. Some values in parsers are just wrapped in arrays like this: extra_function=lambda x: [int(x)]. It makes more sense to do this in the generic parser, since that is where the correct data structures are formed. This will be quite a big task because the changes have to be reflected in the tests as well
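A minimal sketch of the idea behind task 5 (the GenericParser class, the field names, and the LIST_FIELDS mechanism are illustrative assumptions, not the project's actual API): instead of every field rule wrapping its value in a list via its extra_function, the generic parsing step applies the wrapping when it forms the final data structures.

```python
# Hypothetical sketch: names below are illustrative, not the real codebase.

def parse_article_number(raw):
    # Before: the extractor wrapped its own value, e.g.
    #   extra_function=lambda x: [int(x)]
    # Now it just returns the plain value.
    return int(raw)

class GenericParser:
    """Forms the correct output data structures in one place."""

    # Fields whose values must end up as single-element lists.
    LIST_FIELDS = {"article_number"}

    def build_record(self, extracted):
        record = {}
        for field, value in extracted.items():
            if field in self.LIST_FIELDS and not isinstance(value, list):
                value = [value]  # wrap here instead of in every extra_function
            record[field] = value
        return record

parser = GenericParser()
record = parser.build_record({"article_number": parse_article_number("42")})
# record == {"article_number": [42]}
```

With the wrapping centralized, tests only need to change where they asserted on the extractor output, which is why the task notes the test updates.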
[x] 6. Remove type annotations: they are used inconsistently, present in some places and absent in others
[ ] 7. Make the code more Pythonic: we have Java-style code (interfaces). Rewrite the interfaces as abstract classes, possibly with concrete implementations, to avoid boilerplate (for example, the methods get_by_id and delete_all from IRepository have the same implementation in all publishers' Repository classes)