The code assumes source.url is always unique (and in reality it should be). If it is not, the current lookup when writing results takes the first match and uses that row's source_id and publisher_id, which could belong to the wrong source.
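To make the failure mode concrete, here is a minimal sketch in Python. It is not the project's actual code: the `Source` record, the two lookup functions, and the example URLs are all hypothetical stand-ins, and only the field names source_id, publisher_id and url come from the description above. It contrasts a first-match lookup with a stricter one that refuses to guess when a URL is duplicated.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Source:
    # Hypothetical record; only the field names are taken from the issue text.
    source_id: int
    publisher_id: int
    url: str


class AmbiguousSourceError(LookupError):
    """Raised when more than one source shares the same URL."""


def lookup_first_match(sources: Sequence[Source], url: str) -> Optional[Source]:
    # Mimics the behaviour described in the issue: the first row with a
    # matching URL wins, even if other rows share that URL.
    for source in sources:
        if source.url == url:
            return source
    return None


def lookup_strict(sources: Sequence[Source], url: str) -> Optional[Source]:
    # Collects every match and fails loudly on duplicates instead of
    # silently picking one.
    matches = [source for source in sources if source.url == url]
    if len(matches) > 1:
        ids = [m.source_id for m in matches]
        raise AmbiguousSourceError(f"url {url!r} matches source_ids {ids}")
    return matches[0] if matches else None


# Illustrative data: two sources from different publishers share one URL.
sources = [
    Source(source_id=1, publisher_id=10, url="https://example.com/feed"),
    Source(source_id=2, publisher_id=20, url="https://example.com/feed"),
    Source(source_id=3, publisher_id=30, url="https://example.org/news"),
]
```

With this data, `lookup_first_match` returns source 1 for the shared URL, so results that really belong to source 2 would be written against the wrong source_id and publisher_id. `lookup_strict` raises instead, which at least surfaces the problem. Whether the right fix is a check like this or a uniqueness constraint on the URL in the data store depends on how the real lookup is implemented.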
This was a bug I raised before many other people touched the code. I guess it is still relevant, but that depends on the changes made via #4, so please check.