mckinsey / vizro

Vizro is a toolkit for creating modular data visualization applications.
https://vizro.readthedocs.io/en/stable/
Apache License 2.0

Consider adding a file uploader widget #473

Open yury-fedotov opened 1 month ago

yury-fedotov commented 1 month ago

Which package?

vizro

What's the problem this feature will solve?

I am building a small pet project on Vizro and ran into a limitation: there is no file upload widget. Per this conversation (private chat), I see that it is already possible to achieve this functionality, but the workaround requires writing a lot of custom code.

I'm opening this issue to test the hypothesis that such a widget may be worth adding to the package.

It is hard to say without user research, but my feeling is that this feature might expand Vizro's usability quite a lot, because it would allow Vizro to be used in a whole new type of application: displaying user-supplied data of a known schema. That would be in addition to the use case where Vizro already shines, which is displaying pre-defined data from a known source.

Describe the solution you'd like

Something as simple as Streamlit's version of that.

Alternative Solutions

However, I understand that Streamlit can provide such a simple widget definition because of its script-like syntax, where a widget is just assigned to the variable that should collect its value, while Vizro features a completely different design pattern that requires users to explicitly build a Page object etc. What makes it even more complex is that a Vizro solution should have a YAML API to define that widget in addition to the Python workflow.

Additional context

NA

antonymilne commented 1 month ago

Hello @yury-fedotov, thanks very much for raising this issue - as you say, it's been asked before (actually also in #281) and I've added it to our tracker which now shows that at least 3 independent people have requested this functionality so hopefully we can bump its priority up.

I definitely think it's a reasonable ask that could expand usage of vizro, and I'm keen to add it as a feature. One reason it hasn't been added already, other than lots of things on the backlog with competing priorities, is that it's not quite as simple as it might sound. As alluded to in my response on #281, Dash somewhat muddies the waters here because it supports at least two different ways to upload a file, and what's also not clear on the Vizro side is what to do with the file once it has been uploaded.
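
For illustration only, here is a minimal sketch of what handling one of those Dash routes involves. Dash's `dcc.Upload` component passes the callback a `contents` string of the form `data:<mime>;base64,<payload>`, which has to be decoded before anything can be done with the file. The helper name `decode_upload_contents` is hypothetical (it is not part of Dash or Vizro), and the sketch uses only the standard library so it runs without a Dash app.

```python
import base64


def decode_upload_contents(contents: str) -> tuple[str, bytes]:
    """Split a 'data:<mime>;base64,<payload>' string into (mime_type, raw_bytes).

    Hypothetical helper: this is the string format dcc.Upload passes to a
    callback, but the function itself is not part of Dash or Vizro.
    """
    header, sep, payload = contents.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected 'data:<mime>;base64,<payload>'")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Build a sample string the same way a browser upload would encode a tiny CSV.
example = "data:text/csv;base64," + base64.b64encode(b"a,b\n1,2\n").decode("ascii")
mime, raw = decode_upload_contents(example)
```

Decoding is the easy part; the open design question is still what happens to `raw` afterwards (kept per session, persisted, or discarded).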

It would be useful for our user research if you could explain a bit more how you might imagine this working, e.g. would you like the file to be persisted somewhere or just update things on screen in an ephemeral way? Would the uploaded file have an effect on other users' sessions or not? (As a general rule, putting something together that works for a toy project locally is much easier than coming up with a robust solution here. So if you're the only user of the dashboard and you don't care about security, it's much easier to solve this upload problem. But on vizro we need to design everything robustly to work statelessly and securely so that we can scale to production with many untrusted concurrent users easily, which does make it harder to implement functionality.)

And what makes it even more complex is that Vizro solution should have a YAML API to define that widget in addition to the Python workflow.

This bit is actually not so hard since it's basically done for us by the magic of pydantic! Once we have a vizro model worked out, so long as it has field types that easily convert to JSON, this basically comes for free 🙂

yury-fedotov commented 1 month ago

Thanks for the detailed reply @antonymilne, I agree with all you said.

It would be useful for our user research if you could explain a bit more how you might imagine this working, e.g. would you like the file to be persisted somewhere or just update things on screen in an ephemeral way?

Well, to take my use case as an example, what I wanted to build is an interface where users could upload an Excel file with a known schema (i.e. column names) and it would draw a dashboard based on that. Each user uploads their own version of such a file in their session, and it's not shared with other users' sessions.
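
To make the "known schema" part concrete, here is a rough sketch of the check I have in mind, assuming the uploaded file has already been parsed into a list of column names. The names `REQUIRED_COLUMNS` and `missing_columns` and the example columns are hypothetical, just for illustration, and not part of Vizro.

```python
# Hypothetical schema for illustration; a real app would define its own columns.
REQUIRED_COLUMNS = ("date", "region", "sales")


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the upload, in schema order.

    Comparison ignores case and surrounding whitespace in the uploaded names.
    """
    present = {str(c).strip().lower() for c in columns}
    return [name for name in required if name not in present]


# An upload that lacks the "sales" column would be rejected with a clear message.
problems = missing_columns(["Date", " region "])
```

If `problems` is empty, the dashboard would be drawn from that user's data only, with the parsed data held per session rather than in a shared global.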