calgo-lab / green-db

The monorepo that powers the GreenDB.
https://calgo-lab.github.io/green-db/

Incorporate source, country, gender & age information and change color & size to arrays #79

Closed BigDatalex closed 2 years ago

BigDatalex commented 2 years ago

This PR adds the following information to the green-db table and scraping table:

In addition the following fields of the green-db table are changed:

Apart from this, I refactored some filenames and methods to include the country information in their names.

se-jaeger commented 2 years ago

Just want to mention this so that we don't miss it: if #80 is merged, we should update this one accordingly. 👍🏼

Looking forward to it, pretty nice improvements!

BigDatalex commented 2 years ago

> Just want to mention this so that we don't miss it: if #80 is merged, we should update this one accordingly. 👍🏼
>
> Looking forward to it, pretty nice improvements!

Thanks a lot for your commits too! I will update this according to #80 and test the spiders. I will ping you when we are ready to try the notebook 👍

se-jaeger commented 2 years ago

Maybe also check the documentation and add the naming scheme for the files and the names of the spiders.

BigDatalex commented 2 years ago

From my POV we are now ready to test the notebook - the spiders were running fine with the latest changes! 👍

BigDatalex commented 2 years ago

> LGTM, let's merge when linting is fixed. 👍🏼

Awesome! @se-jaeger the mypy linting is on you, right?

en-GB commented 2 years ago

I did say I was looking into it. Unfortunately GitHub doesn't update the conversation unless you reload the page, so I was replying to a 2-hour-old comment.

BigDatalex commented 2 years ago

I documented the changes from @en-GB here: https://github.com/calgo-lab/green-db/issues/74#issuecomment-1177592419 and reverted the commits in this PR. @se-jaeger can you please approve the changes once again, so that we can finally merge this one 😅