FaIRv2.0 - Githubissues

znicholls commented 3 years ago

This issue is intended to be a place to discuss how we deal with FaIR v2.0. I have a few questions about how we want to do this in practice, which shouldn't hold up @JGBroadbent's work but would be good to sort out sooner rather than later.

Should FaIRv2.0 be in the same repository? My suggestion is yes, but that is predicated on the assumption that active development of FaIR1.X stops once 2.0 is sorted (we can still do bug fixes but no major improvements). If that assumption fails, I think we should start a new repository because trying to manage two 'masters' at once will be a nightmare.

If the answer to the above question is no, then things become relatively simple and the rest of my thoughts can be ignored. In case the answer is yes, I will continue.

My suggestion would be that we use the FAIR-v2.0.0 branch as a quasi-master for v2.0 until active development of FaIR 1.X stops. It'll be a touch fiddly but should be fine in the end.
For FaIRv2.0, my suggestion would be that we basically start again on the devops side (continuous integration, docs, releasing etc.), making everything a bit cleaner and more automated. This will be a bit of a learning curve, but a worthwhile one I think. I think it also makes sense because we're doing a complete re-write of the codebase, so may as well do the stuff around it while we're at it.
When the time comes, this means we'll have a commit which removes all the FaIR 1.X stuff and replaces it with whatever state FaIR2.0 is in at that point. The history will look a bit odd, but I think that's the simplest way to do it.

Having said all that, the real question is what does everyone else think? e.g. @chrisroadmap @njleach

njleach commented 3 years ago

First question: my thought is that it should be in the same repo (and this is what's currently stated in the update paper). I do completely agree with you on the two "masters" issue - it would be super confusing for any users.

Second questions:

I completely agree.
Strong agree.
Yes.

You've pretty much described how I imagined the switchover would go so I don't really have anything else to add!

One related (but not urgent) question: where should the Excel version of the model go? I am keen on having a repo for it as well such that a centralised version exists that is (hopefully) correct; rather than having loads of slightly different versions floating around (as has happened in Oxford when I've given people copies of the model at different points...) If having it as a branch to this repo would work I'd be happy with that - but if not I'm happy to create a new repo (could call it EFaIR...?).

znicholls commented 3 years ago

Great! Keen to hear what @chrisroadmap thinks too.

where should the Excel version of the model go?

My instinct would be in this repo too. Then we just need to find an automated way of checking that it is still somewhat correct. That will require a bit of head scratching, but I'm sure we'll find a way.

njleach commented 3 years ago

Awesome - and yep for sure. I'll do a bit of research on testing excel files. I've now got a working version that is more or less identical to the python version (the timestepping is very slightly different because using what I did in python is tricky in Excel and makes the whole thing way less readable so they aren't bit-identical but they given pretty much the same results).

chrisroadmap commented 3 years ago

sorry slow - spinning plates with 1.6 for AR6

I did create a new empty repo at https://github.com/OMS-NetZero/FAIR2 which I set up with some skeleton linting and testing which could be a useful outline. It looks like a lot of work has been done on the branch on this repo so we can delete that other one if it won't be used.
Yes - and see 1. above.
Yes fine, breaking history shouldn't be a big problem right as we're going from 1 to 2 major version and all historical versions will still be available on pypi 3.1. corollary: I guess I'll be the only person interested in periodically fixing 1.X series

Excel version could go in this repo as long as we exclude it from pypi installs? My slight preference would be to put it into a different repo and provide a health warning that it's not likely to be as rigorously tested as the python version.

znicholls commented 3 years ago

3. 3.1. corollary: I guess I'll be the only person interested in periodically fixing 1.X series

That would be my guess. It'll be a bit awkward for you as I think we'll have to either a) do it in a branch in this repo (running into the two master problem a bit but if it's only bug fixes maybe not so bad) or b) split out a FaIR1.X repo.

Excel version could go in this repo as long as we exclude it from pypi installs?

Yep, I think if we put it in a folder called e.g. excel-implementation then we can easily keep it all separate.

I'll do a bit of research on testing excel files

Maybe xlwings would give us a way to automate extracting values from an excel workbook? If we can get the values out, we can then write all the tests in Python which would make our life easy (and allow us to test fairly rigorously).

njleach commented 3 years ago

Sounds good to me.

On xlwings, I thought that looked like the best (only?) option for automating checks of the Excel version. I've started to have a go and it seems like it could be used (in a limited capacity). As far as I can tell, xlwings could be used to extract values from a (static) workbook. So as far as I can see, we could definitely run automated testing to compare the output of a single (pre-defined) scenario between the Excel (default configuration) and python versions to ensure they are similar. What I currently can't see a way to do (outside of pretty laborious manual testing within Excel) would be testing over a range of scenarios or configurations to see if it breaks.

chrisroadmap commented 3 years ago

That would be my guess. It'll be a bit awkward for you as I think we'll have to either a) do it in a branch in this repo (running into the two master problem a bit but if it's only bug fixes maybe not so bad) or b) split out a FaIR1.X repo.

Very happy to retire FaIR 1.x once this is up and running and tested to verify similar behaviour. I'm keeping it going now to fall back on (and I haven't learned FaIR 2 yet)

znicholls commented 3 years ago

xlwings could be used to extract values from a (static) workbook

Damn I was hoping it could be made dynamic... Alright in that case then I think we have to just test that one pre-defined scenario and configuration, and put the warning that everything else is untested. I don't see how we can do much better...

JGBroadbent commented 3 years ago

Hi All,

Just to let you know that a working (i.e. it passes my tests on my machine) version of FaIR 2.0.0 (minus the OpenSCM runner or any built in parameter sets), is now uploaded on my fork of the FAIR-v2.0.0 repo: https://github.com/JGBroadbent/FAIR/tree/FAIR-v2.0.0/fair/2.0.0 .

Let me know thoughts etc + what you'd like next in terms of functionality (beyond an OpenSCM runner and some cleaning up of my code, which are the next things on my agenda) or presets etc.

rgieseke commented 3 years ago

Damn I was hoping it could be made dynamic... Alright in that case then I think we have to just test that one pre-defined scenario and configuration, and put the warning that everything else is untested. I don't see how we can do much better...

It seems openpyxl (https://openpyxl.readthedocs.io/en/stable/index.html) is another alternative for creating/reading XSLX files. (Xlwings seems to be Mac/Windows only.)

On the testing side of different scenarios I imagine one could use a headless LibreOffice Calc version to check different scenarios within GitHub Actions or maybe use the Google Sheets API to upload or check files with different scenarios.

It seems Microsoft's Excel Viewer is deprecated (https://docs.microsoft.com/en-us/office/troubleshoot/excel/get-latest-excel-viewer), and it seems it doesn't calculate formulas anyhow (https://groups.google.com/forum/#!topic/spreadsheet-writeexcel/V8KRXnSQ58E).

njleach commented 3 years ago

Thanks @rgieseke! This is really helpful - I'll have a look into these options. I'm a little concerned that some of the functions I've used to try and make the workbook more dynamic are going to make it incompatible with Calc / Sheets (I've tried opening it in in Calc and it doesn't work, with enough bits going wrong that it would be a lot of effort to make it compatible; though if this is crucial I possibly could make it work without the functions that break compatibility).

I've also realised I've hijacked this thread a little to discuss the Excel version - shall I open a separate issue to discuss this?

rgieseke commented 3 years ago

As i understood one of the ideas was to have a model that could fairly easy be implemented in different languages/environments. In case of these major differences already, LibreOffice/Google Sheets would be different environments then, I guess ...

To test the Excel version programmatically, maybe it's possible to interact with Office 365 or install an Office demo version in a Windows-based GitHub Action ...

OMS-NetZero / FAIR

FaIRv2.0 #85