Closed yulqen closed 3 years ago
Sounds great. Will have a look/play around with it. Would be good to fix up sometime with you to discuss. Will try and find sometime in our diaries. Cheers, Will.
Yeah - just something to consider. You might find they're in the groove with the current set up, but definitely worth considering. Let me know when you're available and I'd be happy to go through it.
@yulqen Check out main.py for the latest on where I've got to on putting code into a cml. First sprint completed and working. I'm going to re-read the real python argparse tutorial and see how I can refactor and improve. But, would also welcome your input. Would be great to discuss on tomorrow. Cheers.
Have pulled latest repo, created a virtualenv, installed from requirements.txt
and blithely done python main.py
and get ImportError
at this point. Will investigate...
Traceback (most recent call last):
File "/home/lemon/code/python/analysis_engine/main.py", line 27, in <module>
from data_mgmt.data import (
File "/home/lemon/code/python/analysis_engine/data_mgmt/data.py", line 9, in <module>
import matplotlib.pyplot as plt
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/__init__.py", line 1111, in <module>
rcParamsOrig = RcParams(rcParams.copy())
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/__init__.py", line 891, in __getitem__
from matplotlib import pyplot as plt
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/pyplot.py", line 32, in <module>
import matplotlib.colorbar
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/colorbar.py", line 40, in <module>
import matplotlib._constrained_layout as constrained_layout
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/_constrained_layout.py", line 52, in <module>
from matplotlib.legend import Legend
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/legend.py", line 43, in <module>
from matplotlib.offsetbox import HPacker, VPacker, TextArea, DrawingArea
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/offsetbox.py", line 33, in <module>
from matplotlib.image import BboxImage
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/image.py", line 19, in <module>
from matplotlib.backend_bases import FigureCanvasBase
File "/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 46, in <module>
from matplotlib import (
ImportError: cannot import name 'get_backend' from partially initialized module 'matplotlib' (most likely due to a circular import) (/home/lemon/code/python/analysis_engine/.venv/lib/python3.9/site-packages/matplotlib/__init__.py)
If you want to share what your intentions are for your CLI interface, I'm happy to help you pull it together, including doing the packaging for PyPI, etc.
EDIT2: I am making progress understanding what you're after here...
EDIT: I see your module comment at the top! (by the way, I think you mean "CLI" rather than "cml". CLI is commonly usaged to mean "command line interface" and that's what we're doing here.
Looks like you want vfm
to be your basic command and have -vfm
, -stage
, -group
, and -quarters
as your arguments. Tip: arguments in argparse are usually of the form stage
rather than -stage
(although there are environments where the single dash is used. Go programmers tend to use the single dash actually - not so much in python.). So, for example, you'd have vfm run stage
and vfm run quarters
. Is that right? If that is the case, you don't actually need run
. You would have, say, vfm stage
or whatever.
To be honest, I've not used argparse
for a couple of years so I can't remember. If I could get this import error fixed I could start to have a look. Something to do with matplotlib
?
Yes I think there is a problem with the matplotlib versioning. It works fine on my computer, but I think it causes problems on windows.
Thanks for all the comments. Will look at in detail tonight.
Upgraded to matplotlib v3.3.0 and it seems to be fine.
Great. Will try that version on my machine and see what happens.
3.3.0 works fine. There are a few, matplotlib related, areas of the code which now crash, but to be honest not entirely sure what they did and if I hash them out code seems to run fine.
I was aware of this versioning issue, so its good to tackle it now. Have updated the requirements.txt
btw the arguments would ideally be something like
analysis_engine vfm -quarters "Q2 20/21" "Q2 19/20" -stage "FBC"
or
analysis engine vfm -group "Rail"
much like datamaps has
datamaps import templates
or
datamaps export masters -d [PATH]
Gotcha. You don't have to adhere to it, but just to note that "options" on the command line are most commonly of the form --quarters
, --group
, --stage
. When you use the full word, you use --quarters
- the double dash. -q
would be the equivalent - it's the short form. The user can choose which to use. See in the docs here the add_argument()
function allows you to specify both forms.
Datamaps actually uses subcommands. In the example you give, datamaps
is the program. import
is the command. templates
is the subcommand. I don't think you'll need subcommands yet but you can do them with argparse if necessary.
In the case of datamaps export masters -d
- datamaps
is the program. export
is the command. masters
is the subcommand. -d
is the option, and --datamap
is also legal here.
Yes so I think it would be best to have it like this.
analysis engine
the program vfm
the command . And then options e.g. -q or --quarters
another example would be something like
analysis engine
the program project reports
the command. This would produce a different output to vfm.
Yes don't think I'll need sub commands.
Yep. project reports
would have to be project-reports
(dash rather than underscore). I don't think you can use multiple words.
I also recommend your program just be engine
or ae
rather than analysis_engine
which is a bit long. You could create an alias in bash to do the same but you can easily package your python code to do it.
Hi Matt, let me know if you might have sometime for a coding session today. I'm only available until 2pm today. Would be great to go through the above. If you can't this week no worries though. Cheers.
@yulqen We've come a long way on this issue! Thanks for all your help and guidance.
I uploaded onto PyPI early the analysis_engine package. I left a few messages on Teams for you (in case you didn't see them).
Now thinking about how to best install and operate the a_e package onto my team's computers, which are of course windows.
Do you think it's best for them to download another shell? Or stick with the basic cmd?
They all have Python already installed so it should be a case of checking the Python version and then pip installing via python3 -m pip install analysis_engine
and from there it will be analysis.exe
subcommand
. Is that right?
Congratulations! I installed it before I saw this message.
At some point soon you'll need to handle this exception, which happens whenever I try to call any command such as analysis risks
:
Traceback (most recent call last):
File "/tmp/toss/bin/analysis", line 8, in <module>
sys.exit(main())
File "/tmp/toss/lib/python3.9/site-packages/analysis_engine/main.py", line 208, in main
args.func(vars(args))
File "/tmp/toss/lib/python3.9/site-packages/analysis_engine/main.py", line 48, in vfm
m = Master(get_master_data(), get_project_information())
File "/tmp/toss/lib/python3.9/site-packages/analysis_engine/data.py", line 55, in get_master_data
project_data_from_master(root_path / "core_data/master_3_2020.xlsx", 3, 2020),
File "/tmp/toss/lib/python3.9/site-packages/datamaps/api/api.py", line 14, in project_data_from_master_api
m = Master(Quarter(quarter, year), master_file)
File "/tmp/toss/lib/python3.9/site-packages/datamaps/plugins/dft/master.py", line 120, in __init__
self._data = project_data_from_master(self.path)
File "/tmp/toss/lib/python3.9/site-packages/datamaps/plugins/dft/portfolio.py", line 12, in project_data_from_master
wb = load_workbook(master_file)
File "/tmp/toss/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 315, in load_workbook
reader = ExcelReader(filename, read_only, keep_vba,
File "/tmp/toss/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "/tmp/toss/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 96, in _validate_archive
archive = ZipFile(filename, 'r')
File "/usr/lib/python3.9/zipfile.py", line 1239, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/lemon/Documents/analysis_engine/core_data/master_3_2020.xlsx'
Now that your code is publically available, you should be the effort to tidy up bugs like this and give helpful messages. I know you only intend to use it internally, but you never know! Plus your team will be using it. Take some time to improve the help documentation too - all can be added to --help
.
I haven't seen anything on Teams?
The shell is the name of the program that runs the command line (in our case no Linux we use bash as the shell). For your Windows use, it's fine to use cmd.exe
. There is another shell on Windows called PowerShell, but they probably don't need to use. Yes, you might need to help them set the PATH variable, that can be a pain in Windows, but apart from that, you're correct: pip install analysis_engine
should do it.
Great stuff Will! Keep in touch with how it's going.
By the way, you should also add a license if you haven't already.
Thanks Matt. Yes I haven't given to you the latest master.wb and it's absence if causing the crash. I'll drop it into Teams now. How the master data is uploaded into the programme is something I need to look at and modify because at the moment it's hard coded, and it can't be that way going forward. But yes there will definitely need to be checks and messaging happening at the upload point... related to that is the next step which will be to keep the master data in a database or perhaps as an interim saved in pickles?
Re PATH is this the path that python uses to run? the same issue as the option that is provided when python is downloaded, but not included as default for some reason?
Yes have the licence is in there.
Cheers.
The PATH to where pip
installs packages also needs to be added. The user should get a message about this the first time they install a package using pip
. Without fixing this, they will have to run something like C:\Users\whoever\AppData\Roaming\Local\Programs\PipPackages\asdokoKo3090\analysis.exe ...
instead of analysis ...
.
https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/
@yulqen I'm going to close this issue. Joe and Robert successful install via pip install
both analysis_engine and datamaps onto their machines this morning. Annie should hopefully be able to do it also later today. We didn't encounter any problems with the PATH to where pip installs thankfully.
I also published up a newer package v0.0.2 of analysis_engine this morning. If you want to try it out you need to create a new folder analysis_engine/core_data/pickle . And the command analysis initiate
has to be done first. analysis_engine now stores its Master class data structure into a pickle file, and access data that way, rather than taking the data out of the excel master files each time. See what you think.
Cheers. :rocket:
As discussed, there is scope for simplification here. You don't need Pycharm for running scripts in this repo.
Really, you could replace all the guidance at https://github.com/banillie/analysis_engine#virtual-environment-set-up with something much simpler.
It would mean your staff running commands in a terminal again, but there are advantages to this:
Instead, the could have a cheatsheet of commands to type into a terminal, or even copy and paste and hit return.
I recommend that they use the new Windows 10 terminal (https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701) but
cmd.exe
in Windows would do, or better use the Windows Powershell terminal. I can discuss with you the subtle differences of using these commands in Windows compared to Ubuntu, but in practice nowadays, there is very little difference.If you wanted to remove the whole Pycharm bit from this process, this is how you could set up the virtualenv without it:
git clone https://github.com/banillie/analysis_engine/ C:\Users\username\code\analysis_engine
for example)cd C:\Users\username\code\analysis_engine
python -m venv .venv
.venv\Scripts\activate.ps1
(if they are using Powershell) or.venv\Scripts\activate.bat
(if usingcmd.exe
)pip install -r requirements.txt
Then, when they want to run scripts in your code, they just ensure they are in the
analysis_engine
root directory, activate the virtualenv, and fire off the commands:cd C:\Users\usrename\code\analysis_engine
.venv\Scripts\activate.ps1
python some_script.py