yulqen / datamaps

MOVED: Command line interface to yulqen/bcompiler-engine. Used by DfT to collect and clean data using Excel spreadsheets.
https://git.sr.ht/~yulqen/datamaps
MIT License
2 stars 2 forks source link

AttributeError #5

Closed banillie closed 4 years ago

banillie commented 4 years ago

The following command dm.exe export master [file path to master] got this error.

 Traceback (most recent call last):
  File "/home/will/.virtualenvs/wills_datamaps/bin/datamaps", line 10, in <module>
    sys.exit(cli())
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/datamaps/main.py", line 158, in master
    engine_cli.write_master_to_templates(blank, datamap, master)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/engine/adapters/cli.py", line 80, in write_master_to_templates
    uc.execute()
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/engine/use_cases/output.py", line 86, in execute
    file_name = col[0].value.split(".")[0]
AttributeError: 'NoneType' object has no attribute 'split'

Have sent you the files. Cheers.

yulqen commented 4 years ago

Can you also email me the config file - sorry, forgot to ask for that earlier. Ta.

banillie commented 4 years ago

sorry stupid question but I've kind of forgotten where the config file sits now. It's sort of hidden now isn't? Does the user actively use it in Datamaps?

yulqen commented 4 years ago

You might be right Will - it's probably a massive documentation fail on my part.

Have a look in something like - anything related to datamaps - there should be a config.ini file somewhere:

C:\Users\<username>\AppData\Local\<AppAuthor>\<AppName>

or

C:\Users\<username>\AppData\Roaming\<AppAuthor>\<AppName>

Of if you're doing this on Ubuntu, it may be in:

~/.local/share/datamaps/...

Let me know. I suspect if you find it, you might be able to fix your problem. I think the docs for bcompiler still apply here - although I'd need to investigate more thoroughly.

banillie commented 4 years ago

will@will-ThinkPad-X230:~/.local/share/datamaps-data$ ll total 8 drwxr-xr-x 2 will will 4096 Jan 3 23:18 ./ drwx------ 24 will will 4096 Mar 3 14:37 ../

can't seem to find it.

I can vaguely remember that this might be a master wb issue. I seem to remember that it was something to do with how the master spreadsheet was structured, because the error wasn't an issue if you ran it with a master that had been compiled by Datamaps itself. does that ring any bells?

yulqen commented 4 years ago

Is it not inside that directory?

What value is at cell A1 in your master? It is expecting something that resembles a file name - i.e. something .xlsx.

banillie commented 4 years ago

definitely something to do with the master being used. I ran a datamaps export master [file path to master] using a master document that was built by datamaps, that worked fine. Will send you that dm.cvs and master file. I thought it might be something to do with blank columns or blank rows, but deleting columns/rows doesn't seem to make difference.

yulqen commented 4 years ago

I've not had a chance to look at this in detail, nor check the files you sent, but I think the reason this is happening is because you are giving the header cell a "non-file name" entry.

When you use datamaps import the program rather dumbly gives the "title" of each column of data the filename of the source - e.g. "Big Roads Project.xlsm", because it does not make any assumptions about the proper title for the project at that stage. This is reflected when you wish to recreate the file upon export, so when you run datamaps export master [etc] the program finds the name of the intended destination file in that cell.

This is why it works when you use a master created by datamaps but crashes when you create your own master. To get it to work in the interim, you need to ensure the destination file name for the export is included in the cell, e.g. big_roads_project.xlsm, for example.

Thanks for your patience in picking this up (again!) - I will fix it so that this no longer happens. Are you happy for the program to create an exported file such that Big Important Roads Project becomes Big_Important_Road_Project.xlsm? This might have implications for the complexity of project titles (for example Big Important Road Project - Stage 2 - 2021/24 might need a bit of imaginative parsing to create a useable file name, bearing in mind you can't have forward slashes, dots etc in Windows file names), but I can write some stuff to handle this and make it as simple as possible.

Let me know what you think.

banillie commented 4 years ago

Hi Matt have tested this and it doesn't seem to be the issue. couldn't get it to work when inserting .xlsm at end of file name. Plus the master that Datamaps compiled doesn't have .xlsm at the end of the file name. :disappointed:

yulqen commented 4 years ago

Ok - will have to look at this on the train up tonight. Did you find at config file in the end?

banillie commented 4 years ago

Thanks. Sorry about this.

yulqen commented 4 years ago

Apologies are all mine Mr Grant!

yulqen commented 4 years ago

@banillie - did you email me the blank_template.xlsm file?? Can only find an email from yesterday ("other test docs") that contains datamap and master file. I may have missed the email. I can probably use an old template, probably...

yulqen commented 4 years ago

Am running this with datamaps v1.0.1 using the master and datamap (but with an old template, which I don't think matters) you sent and getting no error.

Is there a value in cell B1 of the master you're running? The exception is clearly saying that it is finding no value there and I'm not sure why. Email me in a new email if you don't want to include the value on the publicly available thread (which is a good idea).

banillie commented 4 years ago

Hi Matt I sent you two emails via egress yesterday the first batch included the blank template as well as the master that was raising an error. Did you receive it?

The second batch worked, which is why you’re not getting an error.

yulqen commented 4 years ago

No - only got one email with master and datamap.

M

-- Matthew Lemon matt@matthewlemon.com

On Wed, 4 Mar 2020, at 4:38 PM, Will Grant wrote:

Hi Matt I sent you two emails via where yesterday the first batch included the blank template as well as the master that was raising an error. Did you receive it?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hammerheadlemon/datamaps/issues/5?email_source=notifications&email_token=AAB4ISLTWHIJYDXOBLAL5DDRFZ7YNA5CNFSM4LALAKYKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOENY2EBI#issuecomment-594649605, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB4ISP2E3NPNWDA3TA3EQ3RFZ7YNANCNFSM4LALAKYA.

banillie commented 4 years ago

ok have tried sending again, should reach you soon.

banillie commented 4 years ago

@hammerheadlemon did you receive the email I sent yesterday ok?

yulqen commented 4 years ago

I did @banillie but I've not had a chance to look at it yet. The first attempt might been blocked due to it containing a zip a file. Anyway - I've got it and will look at it when I get a chance later today. Sorry not had time yet!

yulqen commented 4 years ago

Can reproduce the error - great. Will dig in later to fix. There's an issue with what openpyxl considers to be the far boundary of the spreadsheet data. Will fix and come back to you very soon.

banillie commented 4 years ago

Thanks. Makes sense. I thought that might be the issue and tried deleting columns, but obviously not enough. Just tested moving all data into a new/blank wb and that worked fine.

yulqen commented 4 years ago

When reading the sheet data at openpyxl.worksheet._reader.WorksheetReader.bind_cells, WorksheetReader.parser.parse() is creating a list row whose values are dicts. In the error master file, starting at idx 7 (row 7 in the sheet), parse() is finding a None value in in row 7 col 18 (col R). There is no visible value in the spreadsheet at this point - there may be a value internally here, but it is being read as None. Regardless, openpyxl is therefore considering col 18 as within the boundary and therefore later when datamaps processes the data extracted from the master is encountering None values - in our problematic case, starting at cell H1.

If you type a value in the spreadsheet at R7, the text is formatted as light blue, and subsequent entries down the rows appear in different colours. Similarly with next door cells.

image

This could be fixed by handling the AttributeError but this complicated by the fact that there are valid None values to be found within the file. The other option is to require the user to specify (using an optional command-line argument) the max column width of the master spreadsheet so that such overruns do not happen. This would provide safety, but obviously changes the UI.

Will investigate further.

banillie commented 4 years ago

clearly something strange happening in the background. just to note the master file (which isn't working) was built from a program I developed to compile a master for commission. that code is here.

https://github.com/banillie/analysis_engine/blob/325f1840e9bb4e0a48f81eb80ebd2e79884804d8/data_mgmt/DfT_commission_data.py#L282

yes I can see how this is a tricky one from a user perspective. presumably the only row of interest in relation to this is row 1. So if Datamaps hits a blank cell there it can assume that this is a stop and start transferring data to templates?

An issue with this would be that if there's a blank column, which is then followed by some columns with data in them, that data would not get transferred. However, this issue could be easily fixed by the user.

yulqen commented 4 years ago

You're right - row 1 is the key, so we could just catch the exception there.

Basically, we can make things a bit more robust in the code and improve documentation to warn the user of the limitations.

Thanks for ref to your code - I'll have a look. Interested to find out what's going on.

In terms of urgency, have you run the export to hit your goal at work?? Did you manage to do it using a clean master file??

banillie commented 4 years ago

yeap we have been able to run the export by moving all the data over to another wb... so that work around solves this issue for us, and no urgency from our point of view.

Thanks for looking into this Matt.

yulqen commented 4 years ago

Fixed in d030d2d2d5be1ee2ec8b4a1939da9ea4cf4506d5.

datamaps.exe v1.0.3 available at Twenty Four Software; released on PyPI as v1.0.3 (use pip install -U datamaps on your system to get the new version).

export function now recognises if it hits a None value in the top cell of a master column and assumes there has been an issue with the datamap, and is trying incorrectly to export data beyond a sensible column boundary. It gives the user a warning in the console log that this has happened but this message may be pushed out of the screen quickly by subsequent warnings such as missing datamap entries - scroll up in the console buffer to view.

Please report further problems.

banillie commented 4 years ago

Great! We’ll test it pronto. What service! Best, Will.

banillie commented 4 years ago

@hammerheadlemon just ran a test on datamaps 1.0.3 and it didn't pass unfortunately. I got this error. Sorry!

Traceback (most recent call last):
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/engine/use_cases/output.py", line 89, in execute
    file_name = col[0].value.split(".")[0]
AttributeError: 'NoneType' object has no attribute 'split'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/will/.virtualenvs/wills_datamaps/bin/datamaps", line 10, in <module>
    sys.exit(cli())
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/datamaps/main.py", line 158, in master
    engine_cli.write_master_to_templates(blank, datamap, master)
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/engine/adapters/cli.py", line 80, in write_master_to_templates
    uc.execute()
  File "/home/will/.virtualenvs/wills_datamaps/lib/python3.7/site-packages/engine/use_cases/output.py", line 91, in execute
    _col_des = string.ascii_uppercase[col[0].column - 1]
IndexError: string index out of range
yulqen commented 4 years ago

Ah, ok. Thanks for taking the time to let me know, and apologies.

Using the same files/command? Are you using PyPI datamaps or datamaps.exe downloaded from the site?

banillie commented 4 years ago

The PyPI one. Should I test the datamaps.exe version as well? Cheers Matt.

yulqen commented 4 years ago

No, it's fine. Presume it's exactly the same test files, etc. I'll take a look.

yulqen commented 4 years ago

I know - stupid me... Can you please test it with 26 cols or less - i.e. no data beyond col Z, and let me know.

banillie commented 4 years ago

Ok tested and it worked if there was only data in it up until column Y, so 25 cols. does that make sense? Thanks Matt.

yulqen commented 4 years ago

Yeah, it does. Thanks for confirming. I implemented the fix without testing scenarios whereby there are over 26 columns. I will work on the fix and post here with an update. Thanks for your patience as ever! Hope you and your family are staying safe and well.

From: Will Grant notifications@github.com Sent: 17 March 2020 16:00 To: hammerheadlemon/datamaps datamaps@noreply.github.com Cc: Lemon matt@matthewlemon.com; State change state_change@noreply.github.com Subject: Re: [hammerheadlemon/datamaps] AttributeError (#5)

Ok tested and it worked if there was only data in it up until column Y, so 25 cols. does that make sense? Thanks Matt.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/hammerheadlemon/datamaps/issues/5#issuecomment-600151497 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB4ISJO2QN7AXJTMSQMZS3RH6NAVANCNFSM4LALAKYA . https://github.com/notifications/beacon/AAB4ISOXXW3AZ4TWUVSYBYLRH6NAVA5CNFSM4LALAKYKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEPCZLSI.gif

yulqen commented 4 years ago

Export function overruns column boundary

banillie commented 4 years ago

yes were all good thanks, although decided to take Arthur out of school. Same for you and your family.

Cheers, Will.

yulqen commented 4 years ago

@banillie - I do have it on my list to fix and close this issue before you do your next export run.

banillie commented 4 years ago

Thanks Matt. That would be great. 👍

yulqen commented 4 years ago

At some point, it would be great if you could test the fix, @banillie:

In your Ubuntu terminal, do the following:

Use the test files you sent me, or at least a master that has data in cells beyond the end row expected - the test master you sent me had text in cells around row 3000; also ensure there is text in cells beyond the end column - so way after the last column of project data. datamaps should be able to accommodate this now. It will throw out a WARNING log message for every project/column, which I have an issue to fix to make it slightly more user-friendly, but it should not throw and AttributeError. To test:

When you are finished testing, get rid of the test virtualenv:

Report back here. Once we're happy, I will build the executable for your guys to work with.

banillie commented 4 years ago

@hammerheadlemon ok will get onto this asap. might be over the weekend. Cheers, Matt.

banillie commented 4 years ago

Hi Matt,

Tested and worked very well. :raised_hands: :100:

It came up with a nice message to highlight that there were extra cells in the master that the user might to check. However, the export continued and completed nicely with no issues.

Two things to note: I got the below error message when (pip) installing datamaps. That message came up in red. When I did pip list it still showed openpyxl being installed though.

Failed building wheel for openpyxl

The other thing was that when opening each output file/project returns in libreoffice I got a message saying number of the columns, in the that file, was exceeded. This generates a warning message, but the wb opens ok. I don't remember having this message before. When I send them over to the work computer they open in excel fine without an error message though. One last thing I need to check with Joe on Monday is that the macros built into the project returns runs ok. Do you want me to send you some of the output files so you can have a look also.

Once again, thank you for all this.

Best, Will.

yulqen commented 4 years ago

Thanks for doing this Will - really appreciate it. I'm going to do a little bit more tidying on this version before I release it (including better log messaging when the long long export process is running - right now it's a black hole -, so I might need you to do the same again very quickly early next week if you can.

I think the pip message can be fixed by doing a pip install wheel in your virtualenv or doing pip install datamaps --no-cache-dir, although I'm not 100%. It's not essential for a wheel to be present for everything to work. So not to worry. I see the pip messages all the time too.

I've not tested on Libreoffice this time (thanks my current focus on Windows!) but will bear this in mind when I switch testing to Linux.

I'll carry on testing with the files you sent me last week. If you find anything else in the meantime - let me know, and let me have the usual batch of files that are problematic.

banillie commented 4 years ago

Have just sent you a few files via egress, which you might find useful. Look forward to further testing. Cheers, Will.