Urban-Analytics-Technology-Platform / demoland-project

Developing a modelling system to quantify features of land use in urban environments, UK based
https://urban-analytics-technology-platform.github.io/demoland-project/
MIT License

Repository organisation #9

Closed martinfleis closed 1 year ago

martinfleis commented 1 year ago

Hey @ciupava, can we do some housecleaning in the repository? I've noticed that we have a bit of a mess here that should not make its way into git, like .DS_Store files, __pycache__ etc. I've also noticed a .gitignore outside of the root...

I am happy to clean that up and set up .gitignore properly so that it filters these files out automatically, but I wanted to check with you first.

One more thing: do you have any structure for the python folder in mind? How to organise notebooks as they come, where to put which bits of code, etc.? I'd like to do some work on the data and am not sure how to structure it so that it fits the mental model you have here.

I assume I'll work on a fork, at least in the beginning, to avoid conflicts, but if we already know how it'll look in the end, that would be superb.

ciupava commented 1 year ago

great @martinfleis, any 'technical' improvement in the sense of 'best practices' is always welcome from my side, so please go ahead with it (cleaning + .gitignore). regarding the python folder, atm it has been more of a place for 'dumping' stuff as I produce it, so +1 for giving it a structure. part of this will take shape as we go, but if we can think of something to start with and get organised around it, that sounds good. regarding forking/branching and such, happy to hear from you about best practices, as above, so let me know.

martinfleis commented 1 year ago

I've merged the PR with the cleanup (#10), so you may need to update your local branch from main. There are gitignore templates for each language that ensure local cache files are not uploaded, as they are not useful on git. I've added one for Python and another for R, plus the macOS .DS_Store.
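
To check that nothing slipped through, a local clone can be scanned for the kind of files mentioned above. This is only a rough sketch: the function name `find_clutter` and the `CLUTTER_NAMES` set are made up for illustration, the set holds just the two names discussed in this thread, and it does not reproduce real .gitignore pattern matching.

```python
from pathlib import Path

# Assumption: only the two names mentioned in this thread are checked.
# The actual gitignore templates cover many more patterns.
CLUTTER_NAMES = {".DS_Store", "__pycache__"}


def find_clutter(root: Path) -> list[Path]:
    """Return files and folders under root whose name is in CLUTTER_NAMES."""
    found = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue  # skip git's own metadata folder
        if path.name in CLUTTER_NAMES:
            found.append(path)
    return sorted(found)


if __name__ == "__main__":
    # List anything suspicious in the current working directory.
    for path in find_clutter(Path(".")):
        print(path)
```

An empty listing only means that none of these two names are present on disk; whether a file is actually tracked by git is a separate question.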

Unless we work on the same files, it is fine if we just push to main, but it can get messy, so I now have my fork and will merge stuff into main via PRs. It is also better in terms of visibility of changes and the review process. Direct pushes to main often go unnoticed and there's no way to review them.

ciupava commented 1 year ago

thanks for this Martin. then before I commit+push again from my local branch I shall update... I need to review how to do that, haven't done it for ages, so if you have any suggestion it'd be welcome.

martinfleis commented 1 year ago

git fetch
git pull

if you're using the terminal. It should hopefully not cause any conflicts.

What tool are you using to deal with git? The terminal, VS Code, GitHub Desktop...?

ciupava commented 1 year ago

terminal commands (from VS Code, where I code)

gmingas commented 1 year ago

Hi both, I work in the Turing REG team and will be involved in the UA work from April. @ciupava and I had a chat about the repo structure yesterday and I had a look.

I agree with everything @martinfleis said. Regarding the structure, I would probably try to have all the core functionality under one folder named src or land_use_demonstrator, with an internal hierarchy. I would also try to keep the core code in .py and .R files, rather than notebooks, although I understand notebooks might be more appropriate/convenient for early exploration. I would eventually use notebooks for examples of how to use the code and put them in a separate folder called examples or similar.
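
As a rough illustration of that layout, the sketch below scaffolds one possible skeleton. Only the folder names `land_use_demonstrator` and `examples` come from the suggestion above; the module names `data.py` and `indicators.py`, the README file, and the `scaffold` helper are hypothetical placeholders.

```python
from pathlib import Path

# Folder names follow the suggestion above; the individual files are
# hypothetical placeholders, not an agreed structure.
LAYOUT = [
    "land_use_demonstrator/__init__.py",
    "land_use_demonstrator/data.py",        # hypothetical: data loading helpers
    "land_use_demonstrator/indicators.py",  # hypothetical: core computations
    "examples/README.md",                   # notebooks showing usage would go here
]


def scaffold(root: Path) -> list[Path]:
    """Create any missing files from LAYOUT under root and return the new ones."""
    created = []
    for relative in LAYOUT:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.touch()
            created.append(target)
    return created
```

Calling `scaffold(Path("."))` from the repository root would create the empty files; running it a second time creates nothing, so existing work is not overwritten.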

ciupava commented 1 year ago

great input, thank you @gmingas. yes, notebooks are atm being used for the exploratory phase, good hint on creating a separate folder with examples later.

martinfleis commented 1 year ago

> I would also try to keep the core code in .py and .R files, rather than notebooks, although I understand notebooks might be more appropriate/convenient for early exploration.

I partially disagree here. Research code should be reproducible in an approachable way, so we should have the top-level code written in documented notebooks. When there are more complex functions to be coded, those should go into .py files and get imported, but you should be able to follow the scientific workflow in a rendered "book", not in .py files.
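
A minimal sketch of that split, with everything in one block so it runs as is. The function `land_use_shares`, the module path in the comments, and the numbers are invented for illustration and are not part of the project.

```python
# --- would live in a .py file, e.g. a hypothetical land_use_demonstrator/indicators.py ---

def land_use_shares(areas: dict[str, float]) -> dict[str, float]:
    """Convert absolute areas per land use class into shares of the total."""
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    return {land_use: area / total for land_use, area in areas.items()}


# --- would live in a documented notebook cell ---
# from land_use_demonstrator.indicators import land_use_shares  # hypothetical import

# Made-up areas in hectares, purely for illustration.
areas = {"residential": 30.0, "green": 10.0}
shares = land_use_shares(areas)
print(shares)
```

The notebook then reads as the scientific workflow (load, compute shares, plot), while the details of each step stay in the imported module.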

gmingas commented 1 year ago

@martinfleis I think we are possibly describing the same thing but with different phrasing.

I totally agree with having top-level notebooks to demonstrate the workflow, and with making them approachable and well documented. I wouldn't define core classes or functions in these notebooks though, unless there is a good reason to. There might be cases where we want to define some functions/classes in notebooks, e.g. when the user needs to be able to define a class and add their own application-specific code to it, and possibly other scenarios. But also, it is really early and I suspect we will come back to this as the project matures to decide what makes more sense.
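
One way the user-defined case could look, as a sketch only: a base class that would sit in the core package and a subclass that a user writes in their own notebook. The names `Scenario`, `adjust`, `run` and `ConvertToGreen` are hypothetical and nothing like them exists in the repository yet.

```python
from abc import ABC, abstractmethod


class Scenario(ABC):
    """Hypothetical base class that would live in the core package."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def adjust(self, areas: dict[str, float]) -> dict[str, float]:
        """Application-specific change to the areas, supplied by the user."""

    def run(self, areas: dict[str, float]) -> dict[str, float]:
        # Work on a copy so the caller's data is left untouched.
        adjusted = self.adjust(dict(areas))
        if any(area < 0 for area in adjusted.values()):
            raise ValueError(f"scenario {self.name!r} produced a negative area")
        return adjusted


class ConvertToGreen(Scenario):
    """Example of a class a user might define in a notebook."""

    def __init__(self, hectares: float) -> None:
        super().__init__("convert-to-green")
        self.hectares = hectares

    def adjust(self, areas: dict[str, float]) -> dict[str, float]:
        areas["residential"] -= self.hectares
        areas["green"] += self.hectares
        return areas
```

The shared checks stay in `run` inside the package, so a notebook only needs to contain the few lines that are specific to the user's own scenario.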

martinfleis commented 1 year ago

@gmingas Ah, yes. Fully agree on that :).