OpenRefine / OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it
https://openrefine.org/
BSD 3-Clause "New" or "Revised" License
Support multiple worksheets per project à la Excel workbooks #3630

Open HichemDax opened 3 years ago

HichemDax commented 3 years ago

Our most common scenario is that we receive data from our customers to import into a new system. The sheets are usually closely related to each other, such as an items master and item transactions. Uploading each sheet into a separate project is time-consuming and makes cross-checks, foreign keys, and VLOOKUP-style lookups less convenient.

Proposed solution

I'd suggest adding the ability to import multiple sheets into one project, along with the ability to do VLOOKUP-style lookups or make cascading changes. For example, if I update an item code in the items list, it should update all instances of that item code in the item transactions.
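
To make the request concrete, here is a minimal Python sketch of the two behaviors described above: a VLOOKUP-style lookup between two related tables and a cascading change to a shared key. It is not OpenRefine code; the tables, the column names (`item_code`, `description`, `qty`), and the helpers `rename_key` and `lookup` are all hypothetical and exist only for illustration.

```python
# Hypothetical in-memory tables standing in for two related sheets.
items = [
    {"item_code": "A-100", "description": "Widget"},
    {"item_code": "B-200", "description": "Gadget"},
]
transactions = [
    {"txn_id": 1, "item_code": "A-100", "qty": 5},
    {"txn_id": 2, "item_code": "B-200", "qty": 2},
    {"txn_id": 3, "item_code": "A-100", "qty": 1},
]


def rename_key(tables, column, old, new):
    """Replace `old` with `new` in `column` of every table; return the number of cells changed."""
    changed = 0
    for rows in tables:
        for row in rows:
            if row.get(column) == old:
                row[column] = new
                changed += 1
    return changed


def lookup(rows, key_column, key, value_column, default=None):
    """VLOOKUP-style helper: return `value_column` from the first row whose `key_column` equals `key`."""
    for row in rows:
        if row.get(key_column) == key:
            return row.get(value_column, default)
    return default


# Cascading change: renaming the code in one place updates both tables.
changed = rename_key([items, transactions], "item_code", "A-100", "A-101")

# Lookup from a transaction row into the items master.
description = lookup(items, "item_code", transactions[0]["item_code"], "description")
```

The point of the sketch is that both operations need the two tables to be addressable separately while sharing a key column, which is what a single combined dataset does not give you directly.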

franparras commented 3 years ago

This functionality works really well in OpenRefine. I'm not sure if I am missing something.

HichemDax commented 3 years ago

@franparras are you sure you can import multiple sheets within the same project? I would be very happy if you could show me how. I tried, and it combined all columns and all rows from all the sheets into one dataset.

franparras commented 3 years ago

Yes, I am. :) Tomorrow I will paste a screenshot here as evidence. :) It is the default behavior, so you don't need to do anything special.

franparras commented 3 years ago

Excel Workbook Openrefine.pptx

franparras commented 3 years ago

Document attached

HichemDax commented 3 years ago

Unfortunately, that's not what I am looking for.

tfmorris commented 3 years ago

OpenRefine will import as many sheets as you want from a workbook into a single project, but it doesn't have the concept of a multisheet project. That would be a major change to the data model and unlikely to happen in the near future.

I'd suggest thinking about ways that you can either work with everything in the same project (you have the option of tagging each row with its source sheet name) or importing each sheet into an individual project.
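
As a rough illustration of the first option (everything in one project, with each row tagged by its source sheet), the Python sketch below works on a combined table exported from such a project. It is a sketch under assumptions, not OpenRefine functionality: the tag column name `sheet`, the sample rows, and the helpers `split_by_sheet` and `orphaned_keys` are hypothetical.

```python
from collections import defaultdict

# Hypothetical export of one combined project: every row carries its source sheet name.
combined = [
    {"sheet": "items", "item_code": "A-100", "description": "Widget"},
    {"sheet": "items", "item_code": "B-200", "description": "Gadget"},
    {"sheet": "transactions", "item_code": "A-100", "qty": "5"},
    {"sheet": "transactions", "item_code": "C-300", "qty": "1"},
]


def split_by_sheet(rows, tag_column="sheet"):
    """Group rows back into one list per source sheet, using the tag column."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[tag_column]].append(row)
    return dict(tables)


def orphaned_keys(child_rows, parent_rows, key):
    """Return sorted key values that appear in the child rows but not in the parent rows."""
    known = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows if row[key] not in known})


tables = split_by_sheet(combined)

# Cross-check: transaction item codes with no matching row in the items master.
orphans = orphaned_keys(tables["transactions"], tables["items"], "item_code")
```

Because the tag column preserves which sheet each row came from, a foreign-key style cross-check remains possible even though the data model holds a single table.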

HichemDax commented 3 years ago

Thank you @tfmorris.

franparras commented 3 years ago

I guess we need to bear in mind that the wow factor in my recipe is not really about creating a project with multiple sheets. In my mind, the key value lies in handling the dataset and the rules, not in a static dataset. Personally, I don't see value in changing what we have today; OpenRefine is not Excel.

wetneb commented 3 years ago

I wouldn't mind keeping an open issue about this - as @tfmorris explained, making it possible for projects to contain multiple tables is a major change that is unlikely to happen soon, but I think it is still a legitimate ask. Even if we don't have immediate plans to work on this it would be nice to gather use cases around it.

Pinging @swkasica who has been thinking about better multi-table support in ETL tools (https://arxiv.org/abs/2009.02373). This is also related to #892.

steve-kasica commented 3 years ago

I agree that OpenRefine is not Excel, and that a tool that is really good at a few things is ultimately more useful than a tool that is only OK at many things. But my perspective on this is that a lot of the tasks that OpenRefine does really well, entity resolution via clustering for example, are closely tied to this issue of merging multiple datasets together, and I would bet some users would appreciate having that functionality in OpenRefine.
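
To illustrate why clustering and merging are related, here is a simplified key-collision sketch in Python. It is not OpenRefine's implementation; the `fingerprint` and `cluster` helpers and the sample names are assumptions made for this example. Values drawn from two hypothetical datasets are grouped when they normalize to the same key.

```python
import re
from collections import defaultdict


def fingerprint(value):
    """Simplified key: trim, lowercase, drop punctuation, then sort the unique tokens."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups with more than one distinct spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical name columns from two datasets that should be merged.
names_a = ["Acme Corp.", "Globex"]
names_b = ["acme corp", "Corp Acme", "Initech"]

clusters = cluster(names_a + names_b)
```

Here the three spellings of the same company end up in one cluster, which is the kind of match needed before rows from two tables can be joined on that column.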