Open utterances-bot opened 2 years ago
hello! thank you for posting this code....was the only article i could find that shows how to bring in a table from an excel sheet with multiple tables!
AttributeError Traceback (most recent call last)
Hi, @brianblakeley. Apologies for replying this late. I just saw your message. Have you resolved the issue? I have wrapped this into a function, that should make it easier to use. Let me know if you still have this issue. Again my sincere apologies for the late reply.
This is brilliant! Thank you very much!
Hello @samukweku. Im trying to use your xlsx.table function from janitor to accomplish the above, however I
m unable to import the xlsx.table function. I have successfully downloaded / installed pyjanitor and import the package into the jupyter notebook, however I can failures when I try to import xlsx.table from janitor. Do you have any advice?
@DanielPerkins472 thanks for getting in touch.
can you share your script?
import janitor as in
jn.xlsx_table
Below is the script that I had used
!pip install pyjanitor from janitor import xlsx_table
ImportError Traceback (most recent call last)
@DanielPerkins472 Ahhhh ... You are on windows it seems. I think you should install git on your machine then retry the installation. Hopefully soon we will do a release on pyjanitor and you do not have to go through this. Lemme know how it goes
With this method, my excel formulas are being imported instead of the resultant values. Is there a way to fix this?
Hi @YehudaColton can you share a sample of the file, and the code used?
Hi @samukweku, I was able to get your dev version of janitor installed. I am now getting a new error when I try to import the xlsx_table function though. Any thoughts?
from janitor import xlsx_table
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-e5fafa9ca390> in <module>
----> 1 from janitor import xlsx_table
~\Github\pyjanitor\janitor\__init__.py in <module>
5 pass
6
----> 7 from .functions import * # noqa: F403, F401
8 from .io import * # noqa: F403, F401
9 from .math import * # noqa: F403, F401
~\Github\pyjanitor\janitor\functions\__init__.py in <module>
17
18
---> 19 from .add_columns import add_columns
20 from .also import also
21 from .bin_numeric import bin_numeric
~\Github\pyjanitor\janitor\functions\add_columns.py in <module>
1 import pandas_flavor as pf
2
----> 3 from janitor.utils import check, deprecated_alias
4 import pandas as pd
5 from typing import Union, List, Any, Tuple
~\Github\pyjanitor\janitor\utils.py in <module>
209 "The keyword argument '{argument}' of '{func_name}' is deprecated."
210 ),
--> 211 error: bool = True,
212 ) -> Callable:
213 """
TypeError: 'type' object is not subscriptable
Hi @DanielPerkins472 let me have a look. This is on a windows machine right? If you have access to a Unix machine, could you try and see if it gives the same error?
Hi @samukweku that's correct this is on a windows machine. I don`t have any Unix experience but will see what I can find out.
@DanielPerkins472 I just tested it on my Unix machine and works fine. unfortunately I do not have access to a windows machine to test it out.
Hi @YehudaColton can you share a sample of the file, and the code used?
wb = load_workbook(file_name, data_only=True)
I modified the code slightly by utilizing this 'data_only' tag and it fixed the issue. Thank you for the response however, as well as the wonderful code.
@DanielPerkins472 if you can, create a virtual environment, install only pyjanitor and see if you still get this error message. Let's isolate it and see if it is an OS issue.
I just tested on a windows VM, in a virtual environment and it works fine.
@Samukweku I ended up just using the openpyxl method above in this thread and was successful. I had an outdated version of anaconda / python that was causing issues so I had to completely uninstall / reinstall and then the openpyxl method worked for me.
@samukweku This is a great resource! One query: these solutions require the tables to be "named". Is there some way to pick up contiguous cells as a table from a sheet where they have not been named as a specific table. Hoping I have explained the issue well!
Hi @hepbc , excel tables require names. If they do not have names, then you can just use pd.read_excel
, as they are just cells. If it is a bit more complex, have a look at janitor's xlsx_cells
. Do you have sample data of your problem, with expected output? I can have a look
Access Excel Tables with Python | Samuel Oranyeli
Extract data from a defined table in a spreadsheet
https://samukweku.github.io/data-wrangling-blog/spreadsheet/python/pandas/openpyxl/2020/05/19/Access-Tables-In-Excel.html