nestauk / industrial_taxonomy

Refactor of nestauk/industrial-taxonomy which upon completion will replace it.
MIT License

15 Fuzzy match Glass <-> Companies House #24

Closed bishax closed 2 years ago

bishax commented 2 years ago

Closes #15

Note to reviewers: run `make conda-update` to update your environment.


Checklist:

bishax commented 2 years ago
* I couldn't quite follow the `JacchammerFlow`, but I am not sure that the script is the right place to explain it. Perhaps direct the user to the methodology in the paper for additional information?

It doesn't expose much of the methodology, more the plumbing which is a little complicated because

Nevertheless, I think the core matching step suffers from a lack of helpful comments.

bishax commented 2 years ago

2021-12-03 13:41:09.346 [2313/start/14504 (pid 1839)] botocore.errorfactory.ClientException: An error occurred (ClientException) when calling the SubmitJob operation: Error executing request, Exception : Job name should match valid pattern, RequestId: 52991157-ea5c-41d5-8623-4d2013332ec5
2021-12-03 13:41:09.482 [2313/start/14504 (pid 1839)] Task failed.
2021-12-03 13:41:09.618 This failed task will not be retried.
Internal error: The end step was not successful by the end of flow.

@georgerichardson thank god for Google and purple links...

I was thinking oh god what is this problem going to be - turns out it's a known one that happens when there are spaces/dots in usernames:

GH issue: https://github.com/Netflix/metaflow/issues/100

Solution in slack: https://data-analytic-nesta.slack.com/archives/C02JHEEEG5U/p1635855884012200

georgerichardson commented 2 years ago

Works fine now with `export METAFLOW_USER` set in `.env`.
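For anyone hitting the same `Job name should match valid pattern` error, here is a minimal sketch of how a safe value for `METAFLOW_USER` could be derived. It assumes the failure comes from the username being embedded in the AWS Batch job name, and that job names accept only letters, digits, hyphens and underscores and must start with a letter or digit. The helper name `safe_metaflow_user` and the fallback value `"user"` are hypothetical and not part of this repo or of Metaflow.

```python
import os
import re


def safe_metaflow_user(raw: str) -> str:
    """Hypothetical helper: make a username safe to embed in a job name."""
    # Replace anything that is not a letter, digit, hyphen or underscore
    # (e.g. the spaces and dots mentioned above) with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", raw)
    # Assumed rule: the name must start with a letter or digit,
    # so drop any leading underscores or hyphens.
    cleaned = cleaned.lstrip("_-")
    # Fall back to a placeholder if nothing usable is left.
    return cleaned or "user"


if __name__ == "__main__":
    # Print a line that could be pasted into .env
    current = os.environ.get("USER", "")
    print(f"export METAFLOW_USER={safe_metaflow_user(current)}")
```

For example, an illustrative username such as `jane.doe` would become `jane_doe`, which could then be set as `export METAFLOW_USER=jane_doe` in `.env`.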

Juan-Mateos commented 2 years ago

Good to merge!