cainmagi opened 3 years ago
Thanks for the detailed PR! I'm happy to see this progressing. One thing I'm concerned about is that there might be some side effects that I cannot foresee, and the complexity of the things this package can do is increasing. This can lead to difficulties in maintaining dash-uploader in the future. What I would suggest is adding some automatic tests for all the new features. This way we can see that nothing breaks when something is changed.
I have to admit that this PR is large even though the API changes are very small. At least, I could confirm that the two examples I provide (as well as the `usage.py` in the root folder) work well.
For the automatic tests, should it be related to the other issue #34? I am guessing that you want to use `pytest`, but currently there are no testing codes available in `./test`.

I know that `pytest` could be integrated with GitHub Actions, and I have tried to do that before. I know how to write testing codes for `dash` (although I am not an expert on that), but I find it difficult to set up Chrome drivers for GitHub Actions. I guess that a testing container may be required before setting up GitHub Actions. I am still learning how to do it now. If you do not mind my poor code style, and only need to perform tests locally, I could try to write some testing codes and start another PR.
I would be delighted to see a PR for some basic testing! :) No need to add GitHub Actions etc. at this stage, just bare minimum to get testing started. This is exactly what issue #34 is about.
One thing I'm now thinking is that `du.configure_upload` does a slightly different thing on Flask and on Dash apps, right? I know that you had the Flask part in a separate function in a previous PR, and I'm now thinking (just initially) that if it does a different thing, it might make more sense for the user to have a different function, something like `du.configure_flask`. The `du.configure_upload` could call it internally to prevent duplicate code. But this is just initial thinking. I really should take time to look through the additions in this PR and try to understand the implications from all the possible angles.
One thing I'm wondering is what the motivation is behind supporting multiple calls to `du.configure_upload`, if there is still no full support for multiple `du.Upload` components on the same app? (referring to https://github.com/np-8/dash-uploader/issues/35) Or is there some chicken-and-egg problem here?
Please do not try to merge it now. We may need to change the API according to the previous discussion.

If we discard the implementation based on multiple calls to `du.configure_upload`, we would need to add an argument like `service_addr` or `upload_api` to `du.Upload`, and make the `upload_api` configured in `du.configure_upload` become a default value. We need to do that because we need to support cross-domain uploading based on this PR in the next step.
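As a sketch of that fallback behaviour (the helper name is an assumption, not the actual dash-uploader API), the component-level value would simply override the configured default:

```python
def resolve_upload_api(component_upload_api=None, configured_default=None):
    """Hypothetical helper: a du.Upload-level upload_api overrides the
    default that was set in du.configure_upload."""
    if component_upload_api is not None:
        return component_upload_api
    return configured_default
```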
If you think the implementation of multiple calls to `du.configure_upload` is preferred, I think we may not need to split `du.configure_flask` from it, because in the next step we may need to provide a `du.configure_remote_upload` like this:

```python
du.configure_remote_upload(app, remote_addr, upload_component_ids)
```

because an uploader directed to the remote server only needs to handle the callback. The upload folder, API, and requests are handled by the remote side.
I have realized that in my case, I do not need to use `du.configure_upload` or something like `du.configure_flask` multiple times, because it is not necessary to configure multiple different APIs for the same app, and it is not possible to serve multiple different apps. But I need to do something like:

```python
du.configure_upload(app, ...)
du.configure_remote_upload(app, remote_addr_1, upload_component_ids=...)
du.configure_remote_upload(app, remote_addr_2, upload_component_ids=...)
...
```

For the remote side, I only need to configure something like this:

```python
du.configure_upload(flask_app, ...)
```

So serving multiple calls of `du.configure_upload` may not be a good choice. But we need to serve multiple calls to something like `du.configure_remote_upload`.
What do you think of that? If the multiple calls to `du.configure_upload` are not preferred, I could start another PR and close this one. Before that, we may also need to decide whether it is necessary to split out `du.configure_flask`.
This clearly needs some more thinking after a good night's sleep. Here are some fast notes. These are not necessarily answering anything, but they can help us later when designing the future API & functionality of dash-uploader.
- The `Dash` instance will have `url_base_pathname`, which defaults to `/`. This is not currently used in dash-uploader, as the following two give more fine-grained control:
  - `requests_pathname_prefix`: the prefix for AJAX calls from the client (browser). It must end with `routes_pathname_prefix`. This defaults to `url_base_pathname` if not given.
  - `routes_pathname_prefix`: the prefix for API routes on the backend (Flask).

These values are read from the dash `app` instance and saved to `du.settings` when `du.configure_upload` is called.
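For illustration, the precedence described above could be written out like this (a re-implementation of the described rules for clarity, not the actual Dash source):

```python
def resolve_prefixes(url_base_pathname=None,
                     routes_pathname_prefix=None,
                     requests_pathname_prefix=None):
    # url_base_pathname defaults to "/", and both prefixes fall back to it
    base = url_base_pathname or "/"
    routes = routes_pathname_prefix or base
    requests_ = requests_pathname_prefix or routes
    # requests_pathname_prefix must end with routes_pathname_prefix
    if not requests_.endswith(routes):
        raise ValueError(
            "requests_pathname_prefix must end with routes_pathname_prefix"
        )
    return routes, requests_
```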
The `du.configure_upload`:

- reads `requests_pathname_prefix` and `routes_pathname_prefix` and saves them to `du.settings`;
- saves the `app` to `du.settings.app` (this needs to be saved for `du.callback` to function);
- saves `UPLOAD_FOLDER_ROOT` to `du.settings`;
- configures the upload route at `<routes_pathname_prefix>/<upload_api>`, or just `<upload_api>` if `routes_pathname_prefix` is missing.

The AJAX calls from the browser go to `<requests_pathname_prefix>/<upload_api>`, or just `<upload_api>` if `requests_pathname_prefix` is missing, using the values in `du.settings` (which are saved when calling `du.configure_upload`). In the multi-server case, should these come from `du.settings` or from the `app` sending the requests(?) (Possible problem: how to validate users in the multi-server case? Would it be better to route the data through the dash app anyway?)

Thank you for replying to me. Currently, I tend to drop the implementation of this PR, because I think we may not need the multiple calls to `du.configure_upload`. I may continue to work on this project next week. Before that, I want to know what you think of `du.configure_upload`. Even if we drop the multiple-call-based implementation, we could still make `du.configure_upload` support the Flask app.
First, let me explain my view on authentication. Currently, I have to acknowledge that I have not considered this problem. To my knowledge, authentication could be maintained in a very complicated (but also very safe) way or a very simple way.

But an important problem is that authentication is an optional feature for dash. We cannot expect that every dash app is equipped with authentication, and we may not know how the users apply authentication. For the Flask case, we may have a similar situation. If we want to make authentication a built-in feature of `dash-uploader` for multi-services, we may need to add the following features for the users:

1. Something like `dash_auth.BasicAuth`.
2. A flask API like `/login`: the `GET`/`POST` method would accept a user name and hashed password, and return a randomly generated token. The user name and password may be able to be automatically extracted from the dash app. This flask API could also be used for validating whether the remote service is online.
3. Requesting `/login` again. This step may need to be automatic.

Because I am not a professional web developer, I am not sure whether these steps are widely acceptable. I know that there are some plugins like `flask-login` or `flask-jwt` that may make it easier to develop the authentication feature, if you do not mind adding them to the dependencies. But in my opinion, providing the authentication feature may not be necessary, because users could implement the authentication by themselves and use your `HttpRequestHandler` to add customized codes for the authentication.
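For concreteness, the `/login` token flow described above could be sketched like this (a pure-Python illustration; the class and all names are invented for this sketch and are not part of dash-uploader):

```python
import secrets

class TokenAuth:
    """Hypothetical token store backing a /login endpoint."""

    def __init__(self, users):
        self.users = users    # {username: hashed_password}
        self.tokens = set()

    def login(self, username, hashed_password):
        # Accept a user name and hashed password; return a random token.
        if self.users.get(username) != hashed_password:
            return None       # reject: unknown user or wrong password
        token = secrets.token_hex(16)
        self.tokens.add(token)
        return token

    def validate(self, token):
        # Each upload request would carry the token for validation.
        return token in self.tokens
```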
Before the next step, I think we need to determine one thing: do we need multiple calls to `du.configure_upload`? My project only requires one configuration for either the dash app or the flask app. We would only benefit from the multiple calls by configuring different upload folders and different upload APIs, but I think in most cases we do not need this feature. For the remote service case, we may also need to determine the functionalities, but I think we could discuss that in the next PR about multi-services.
I was thinking that I would have had more time today to think this through, but unfortunately that did not happen. I wish I had a few hours just to think on this, so I could give my opinion. I appreciate your time and contribution.
For the authentication case, dash-uploader was first released without any possibility (without changing the source code) of having any kind of authentication. Now there is `HttpRequestHandler`, which I made partly to support authentication. I do not have much experience on the topic myself either, and as you said, most dash apps are without authentication. I would recommend anyone using dash-uploader or similar to use some sort of authentication and validation for the inputs, and perhaps some upload rate limits per user. I have a feeling that adding authentication to dash-uploader might be a bit complicated, as there might be very different authentication schemes people use. So for now, I think it can be left as the responsibility of the user to subclass the `HttpRequestHandler` and add their own logic. In the future, perhaps there could be some examples or helper functions, though.
Of course, dash-uploader should be designed in such a way that authentication is possible (and relatively easy). So the topic is good to keep in mind.
Yeah, that is a valid point: do we need multiple calls to `du.configure_upload`? Actually, what I was thinking would be really helpful as the next step is to make a list of requirements/wishes on what we really wish to accomplish, and how the user of dash-uploader would use the features. In my opinion, having multiple dash-uploader components on the same web app would be "nice to have", but I remember only one person asking if it could be made possible. The React component should only ever be sending AJAX calls to one address, that is for sure. But should it be possible to send them directly cross-domain (to a flask server?). Or would it be better to make the Javascript always talk with the "main dash app", so that authentication is easy, and then "pipe" the calls forward to somewhere else if needed?
Here is a simple diagram which explains what I mean:
Some random thoughts

- Each `du.Upload` component needs `du.configure_upload` to be called. This means that if there were multiple `du.Upload` components on the same app, there should be one call to `du.configure_upload` per component.

Hello! I am back for this project now. First, please let me explain my understanding of your thoughts:
1. Only `dash` could handle the callbacks. As I have proposed before, the locally fired callback function needs to send a request to the remote service, and use the response of the service to provide the output value of the local callback. To implement this workflow, we need to write code for sending requests (by `urllib3` or `requests`) from the dashboard side, and code for handling the requests (by `Flask`) on the remote side. I have written some code for remote forwarding, and successfully worked out a dashboard connected with a remote service. Unfortunately, this code needs to be kept secret by my organization. However, I could still start another PR for providing remote services and explain the idea in detail. Since the mechanism is a little bit complicated, I am wondering whether it should be integrated as a part of `dash-uploader` (maybe it is better to let users implement their own APIs?). We could discuss this topic in that PR.
2. `dash` is usually equipped with the basic authentication mentioned above. In my opinion, if we need to connect to an existing remote service, it should provide some kind of authentication (if needed). In most cases, what we need to do is to integrate the provided authentication with `dash-uploader`, so I think your `HttpRequestHandler` is already a good enough solution. (In my case, I need to write the service from scratch.)
3. I have not used `Dropbox`, but I have written some code with the `Github API v3` before. In my experience, `HttpRequestHandler` may not be suitable in this case, because we could not hack into the server-side code of those services. The only thing we could do is to use their APIs, which may be quite different from the APIs of resumablejs. It seems more practical to call the 3rd-party APIs after the file gets uploaded to the dashboard. This process could be finished in the callback.

For `du.configure_upload`, I am a little bit confused. If we need to call `du.configure_upload` for each component, why not merge it with `du.Upload`? I think it may look more reasonable to modify the APIs like this:
```python
du.configure_upload(app=app)
...
du.Upload(
    id='upload',
    ...,
    upload_api=...,
    upload_folder=...,
)
```
The `du.configure_upload` should only be used for maintaining the `app`. It is not necessary to save something like `routes_pathname_prefix`, because we could get the value by `settings.app.config.get('routes_pathname_prefix', ...)`.
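For instance, a lazy read from the stored app could look like this (a sketch; the `Settings` stand-in is an assumption, and only the `app.config.get` idea comes from the discussion above):

```python
class Settings:
    """Minimal stand-in for du.settings."""
    app = None

settings = Settings()

def get_routes_prefix(default="/"):
    # Read the prefix from the stored app on demand instead of caching it.
    return settings.app.config.get("routes_pathname_prefix", default)
```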
Your reply about this point would give me ideas for the next step of this PR. Thank you!
Hi @cainmagi ,
If I understood this correctly, this:

> As I have proposed before, the locally fired callback function needs to send a request to the remote service, and use the response of the service to provide the output value of the local callback. To implement this workflow. We need to write codes for sending requests (by urllib3 or requests) from the dashboard side, and codes for handling the requests (by Flask) from the remote side

...says that you were also about to pipe all the AJAX calls (from the browser) through the Dash app to your Flask app (and not send them from the browser directly to Flask)?
Thank you for the question! I would very much like to simplify the user experience by using `du.Upload` independently from `du.configure_upload`, as you suggested. However, there are some restrictions that lead to the current dash-uploader API:

- `du.configure_upload` configures the Flask routes that the dash app is using, i.e. it configures the server to receive HTTP requests at some endpoint(s). This route configuration needs to be done before the dash app is started. However, some `du.Upload` components might not have been created when the app is started. This is the reason that the `upload_api` must be in `du.configure_upload` as an argument.
- I am not sure whether the `folder` argument of `du.configure_upload` is absolutely necessary to be there (or whether it could be moved to the `du.Upload` component). I should take a bit of time to investigate this, but before I do that, I should ask the question of "why" it would be beneficial for the `du.Upload` component to have the `folder` as an argument, and not `du.configure_upload`.

What I am basically saying is that one needs to have one call to `du.configure_upload` (to open a route in the Flask app and define the output folder) for each `du.Upload` component.
- Each `du.Upload` component should upload to a different folder, or, if the same folder is accepted for multiple `du.Upload` components, there should be some logic to make sure that if two files with the same name are uploaded at the same time by two different components, there would be no clash between them. So the easiest way would be just to enforce using different upload folders.

Not really, because:

- We do not need to forward the `dash-uploader` requests to the remote Flask service, because with the cross-domain modifications these requests are already sent from the browser to the remote service directly. The dash app would not receive those requests in this case.
- Although the `dash-uploader` requests have been directed to the remote service, the `/_dash-update-component` request would still be sent to the local dash app, i.e. the callback would still be fired locally. However, we do not need to forward the `/_dash-update-component` either; because the remote service is a Flask app, it is not worth writing code to forward the dash-formatted requests and handle them.
- Instead, the locally fired callback sends a request to the remote service (by `urllib3` or `requests`). Because this request is totally customized, it is easy to control what we want to do on the remote side. In this way, the request would serve as "a flask callback". Because it is always sent by the callback function, if the callback is not fired, the request would not be sent.

I think I should draw a figure in the next PR (about the cross-domain implementation) to make this workflow clearer.
configure_upload
Thank you for your explanation. I find that I misunderstood your post again :sweat_smile:. In the previous post, I suggested moving the folder and API into `du.Upload` because I thought `du.configure_upload` needed to be called multiple times. But now I find that it is not necessary to do that. If there are cross-domain services, compared to the following style:

```python
du.configure_remote_upload(app, remote_addr_1, upload_component_ids=['id1', ...])
du.Upload(id='id1', ...)
```

I prefer the following style:

```python
du.Upload(id='id1', service=remote_addr_1, ...)
```

Then we do not need to provide multiple calls to `du.configure_upload`, because each app only requires one configuration. For example, the remote services should be configured from the remote side, while the dash app only needs one configuration.
The next step would be to modify `du.configure_upload` and let it support the Flask app. This would be a prerequisite step for the cross-domain implementation. If you think the above statements are OK, I could close this PR and start another PR for supporting the Flask app first. Thank you!
Thanks for the explanations! A diagram would surely help to understand the working principle much faster!
If we would use `du.configure_upload` for Flask apps, the `upload_api` parameter documentation does not make sense, as the `requests_pathname_prefix` is specific to Dash, I think. Not sure if it should be a separate `du.configure_flask` function or not. But that can also be decided after some initial implementation, when we see how it goes.
> If there are cross-domain services, compared to the following style: `du.configure_remote_upload(app, remote_addr_1, upload_component_ids=['id1', ...]) du.Upload(id='id1', ...)` I prefer the following style `du.Upload(id='id1', service=remote_addr_1, ...)`
Do you think it would work with just `du.Upload(id='id1', service=remote_addr_1, ...)`? I mean, without any `du.configure_remote_upload` calls in the dash app? If so, that would be awesome! There probably needs to be at least one `du.configure_upload` call in the app, even if there was only one Upload component that would send directly to a remote Flask app, because the callback functionality needs it. I mean, if you want to use `du.callback`. This should be something like:

```python
configure_upload(app, folder=None, use_upload_id=False, upload_api=None, http_request_handler=None)
```
But what if you have two `du.Upload` components, and you want one of them configured to send to the main Dash app and the other to send directly to the remote Flask app?
Yes, there are some dash-specific configurations like `requests_pathname_prefix`. However, these configurations are only coded in the function, and not exposed as arguments. My concern is that the dash `app` and the Flask `app` share the same API:

```python
# If this is a dash app
du.configure_upload(app, folder=None, upload_api=None, use_upload_id=None, http_request_handler=None)

# If this is a flask app
du.configure_upload(app, folder=None, upload_api=None, use_upload_id=None, http_request_handler=None)
```
All of the arguments would be used for configuring the app in both cases. To reduce code repetition, we could add a check in the implementation of `du.configure_upload` and disable some dash features if the `app` is a Flask app.
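Such a check could be as simple as the following sketch (an illustration only, not the real dash-uploader implementation; the duck-typing reflects that a `dash.Dash` instance exposes its underlying Flask app as `app.server`):

```python
def split_app(app):
    """Return the underlying Flask app plus a flag telling whether the
    Dash-only features (pathname prefixes, component callbacks) apply."""
    is_dash = hasattr(app, "server")       # dash.Dash wraps a flask.Flask
    flask_app = app.server if is_dash else app
    return flask_app, is_dash
```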
Yes, I am quite sure that `du.Upload(id='id1', service=remote_addr_1, ...)` works, because I have already implemented it in my project before. The reason is that such a component needs neither a local upload folder nor a local upload API. In this case, there may be a problem in `du.callback`. See the following line:

https://github.com/np-8/dash-uploader/blob/f76252936ffc456a00c3930c569a458af44acd40/dash_uploader/callbacks.py#L36

To make `du.callback` correct, we may need to add a check. My idea is to add a hard-coded state: if the property `service` is cross-domain, `du.callback` would not append the `settings.UPLOAD_FOLDER_ROOT` to the file name. Currently, I remove it in the callback function body.
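The proposed check could look like this (a sketch; the `service` flag is the hypothetical property discussed above, not an existing dash-uploader attribute):

```python
import os

def resolve_uploaded_path(filename, upload_folder_root, service=None):
    """For a cross-domain component the file lives on the remote side, so
    the local UPLOAD_FOLDER_ROOT must not be prepended."""
    if service is not None:       # cross-domain upload: keep the name as-is
        return filename
    return os.path.join(upload_folder_root, filename)
```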
Let me use your example to make a simple explanation:

- The first component (local uploading) is configured by `du.configure_upload`, which uses `settings` to set the `service`. The second component (remote uploading) is configured by something like `du.Upload(..., service='http://xx.xx.xx.xx:xxxx/api-upload')`; the configurations in `settings` are not used.
- Since `du.callback` only checks the existence of `settings.app`, although we call `du.configure_upload` only for the first component, we could actually use `du.callback` for both components.
- The `du.callback` for the second component needs to follow the request-based workflow I mentioned before.

The only problem is that, in some cases, we do not need the first component, i.e. we may only have remote uploading components. Then we actually do not need to configure a local upload API. But I think we should still force users to call `du.configure_upload`, because it would not cause any conflict, and it is necessary to register the dash `app` for `du.callback`.
Introduction

This PR preserves all existing APIs. In other words, the modifications are backward compatible with previous APIs.

This PR is designed to provide the following new API, where we have:
- `du.configure_upload` could be used several times.
- `app` could be a `dash.Dash` instance or a `flask.Flask` instance. The argument type would be checked during the configuration.
- If `upload_component_ids` is provided, it could be:
  - `None`: This is the default value. It means that this `du.configure_upload` is designed for default configurations.
  - `str`: This is the same as `(str, )`.
  - `tuple` or `list`: A sequence of `str`. Each `str` is a component id. The component with the provided id would be configured by the provided configurations.
- A validation is applied to `upload_component_ids`. If it is not `None`, each item is required to be a non-empty `str`, and each item should be unique, i.e. repeating configurations for the same component is not allowed.

Example
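A sketch of the validation rules listed above (a pure illustration; the function name is made up and this is not the PR's actual code):

```python
def normalize_component_ids(upload_component_ids):
    """None keeps the default configuration; a str behaves like a 1-tuple;
    a tuple/list must hold unique, non-empty strings."""
    if upload_component_ids is None:
        return None
    if isinstance(upload_component_ids, str):
        upload_component_ids = (upload_component_ids,)
    ids = tuple(upload_component_ids)
    if not all(isinstance(cid, str) and cid for cid in ids):
        raise TypeError("each component id must be a non-empty str")
    if len(set(ids)) != len(ids):
        raise ValueError("repeating configurations for the same component "
                         "is not allowed")
    return ids
```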
For dash
Users could change the `upload_component_ids` in the second `du.configure_upload` and check what happens.

For Flask
In this case, we do not actually need to configure `upload_component_ids`, because a Flask app does not have components. It should only be used for managing APIs.

For the next step
- A check needs to be added to `du.callback`. If the component is not configured for a dash app (not configured, or configured for a flask app), the `du.callback` would be forbidden. However, it is possible to implement a "Flask callback": we could invoke the callback by using the request handler after the final chunk is uploaded. I am wondering whether we need to implement this feature (maybe it should be split into another PR).
- Because a Flask app does not have `component ids`, we may have two options for the implementation: (1) Add the component id to the arguments of the request; then the `upload_component_ids` should be preserved, because the Flask app could identify the component id. (2) Each "Flask callback" is designed for an API; in this case, we only need to add a check to `du.configure_upload` and forbid users to set `upload_component_ids` when `app` is a Flask app.

Update report
- `du.settings`.
- `update_upload_api`: moved from `du.upload` to `du.configure_upload`.
- `upload_component_ids`: added to `du.configure_upload`. This enables users to set different configurations for different components.
- `du.configure_upload`.