Open FridrikLax opened 9 months ago
Setting WTF_CSRF_ENABLED = False
had no effect.
Sample command along with logs from 3.0.0
vs. 2.1.0
superset-cli -u admin -p admin --loglevel debug http://localhost:8089 import-assets ./ --overwrite
3.0.0
127.0.0.1 - - [26/Sep/2023:14:41:30 +0000] "GET /superset/welcome/ HTTP/1.1" 302 201 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET /login/ HTTP/1.1" 200 51492 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET /api/v1/database/?q=(filters:!(),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100) HTTP/1.1" 401 39 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET /login/ HTTP/1.1" 200 51490 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "POST /login/ HTTP/1.1" 302 189 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET / HTTP/1.1" 302 223 "http://localhost:8088" "Apache Superset Client (0.2.8)"
2023-09-26 14:41:31,722:WARNING:root:Class 'werkzeug.local.LocalProxy' is not mapped
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET /superset/welcome/ HTTP/1.1" 302 201 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET /login/ HTTP/1.1" 200 51485 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:31 +0000] "GET /api/v1/database/?q=(filters:!(),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100) HTTP/1.1" 401 39 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "GET /login/ HTTP/1.1" 200 51490 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "POST /login/ HTTP/1.1" 302 189 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "GET / HTTP/1.1" 302 223 "http://localhost:8088" "Apache Superset Client (0.2.8)"
2023-09-26 14:41:32,393:WARNING:root:Class 'werkzeug.local.LocalProxy' is not mapped
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "GET /superset/welcome/ HTTP/1.1" 302 201 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "GET /login/ HTTP/1.1" 200 51490 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "GET /api/v1/database/?q=(filters:!(),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100) HTTP/1.1" 401 39 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "GET /login/ HTTP/1.1" 200 51488 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:32 +0000] "POST /login/ HTTP/1.1" 302 189 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:33 +0000] "GET / HTTP/1.1" 302 223 "http://localhost:8088" "Apache Superset Client (0.2.8)"
2023-09-26 14:41:33,079:WARNING:root:Class 'werkzeug.local.LocalProxy' is not mapped
127.0.0.1 - - [26/Sep/2023:14:41:33 +0000] "GET /superset/welcome/ HTTP/1.1" 302 201 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:33 +0000] "GET /login/ HTTP/1.1" 200 51491 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:33 +0000] "GET /api/v1/database/?q=(filters:!(),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100) HTTP/1.1" 401 39 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:33 +0000] "GET /login/ HTTP/1.1" 200 51491 "http://localhost:8088" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:33 +0000] "POST /login/ HTTP/1.1" 302 189 "http://localhost:8088" "Apache Superset Client (0.2.8)"
2.1.0
:
127.0.0.1 - - [26/Sep/2023:14:41:39 +0000] "GET /superset/welcome/ HTTP/1.1" 200 27302 "-" "python-requests/2.31.0"
127.0.0.1 - - [26/Sep/2023:14:41:39 +0000] "GET /api/v1/database/?q=(filters:!(),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100) HTTP/1.1" 200 729 "http://localhost:8089" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:39 +0000] "GET /api/v1/database/?q=(filters:!(),order_column:changed_on_delta_humanized,order_direction:desc,page:1,page_size:100) HTTP/1.1" 200 519 "http://localhost:8089" "Apache Superset Client (0.2.8)"
127.0.0.1 - - [26/Sep/2023:14:41:39 +0000] "GET /api/v1/database/export/?q=%21%285%29 HTTP/1.1" 200 759 "http://localhost:8089" "Apache Superset Client (0.2.8)"
Updating dbs Trino
2023-09-26 14:41:39,915:INFO:superset.models.helpers:Updating dbs Trino
127.0.0.1 - - [26/Sep/2023:14:41:39 +0000] "POST /api/v1/assets/import/ HTTP/1.1" 200 17 "http://localhost:8089" "Apache Superset Client (0.2.8)"
Having exactly the same issue, I believe it's to do with redirects.
I've had some success getting it to at least auth with JWT by ensuring there's no redirect for the trailing /
when hitting the csrf endpoint:
# auth/superset.py
...
class SupersetJWTAuth(TokenAuth):  # pylint: disable=abstract-method
    ...
-       response = self.session.get(
-           self.baseurl / "api/v1/security/csrf_token/",  # type: ignore
-           headers={"Authorization": f"Bearer {jwt}"},
-       )
+       url = str(self.baseurl / "api/v1/security/csrf_token/")
+       url = url if url.endswith("/") else url + "/"
+       response = self.session.get(
+           url,  # type: ignore
+           headers={"Authorization": f"Bearer {jwt}"},
+       )
        ...
This avoids the redirect caused by str(yarl.URL) dropping the trailing slash (which is what happens under the hood if you follow the get request into the requests package, in requests.models.PreparedRequest.prepare_url()).
That gets me a bit farther, but it then bombs at the next hurdle because an https request gets redirected to http by Superset. I'm not sure whether this is then a Superset issue rather than a superset-cli one.
Logs before change:
[12:53:02] DEBUG [[12:53:02]] DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): mysuperset.site:443 connectionpool.py:1014
DEBUG [[12:53:02]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET /api/v1/security/csrf_token connectionpool.py:473
HTTP/1.1" 308 321
DEBUG [[12:53:02]] DEBUG: urllib3.connectionpool: Starting new HTTP connection (1): mysuperset.site:80 connectionpool.py:245
DEBUG [[12:53:02]] DEBUG: urllib3.connectionpool: http://mysuperset.site:80 "GET /api/v1/security/csrf_token/ connectionpool.py:473
HTTP/1.1" 301 134
DEBUG [[12:53:02]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET /api/v1/security/csrf_token/ connectionpool.py:473
HTTP/1.1" 401 39
Logs after change:
[12:48:03] DEBUG [[12:48:03]] DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): mysuperset.site:443 connectionpool.py:1014
[12:48:04] DEBUG [[12:48:04]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET /api/v1/security/csrf_token/ connectionpool.py:473
HTTP/1.1" 200 105
[12:48:07] DEBUG [[12:48:07]] DEBUG: preset_cli.api.clients.superset: GET superset.py:433
https://mysuperset.site/api/v1/database?q=(filters:!((col:database_name,opr:eq,value:default_sandbox)),order_column:
changed_on_delta_humanized,order_direction:desc,page:0,page_size:100)
DEBUG [[12:48:07]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET connectionpool.py:473
/api/v1/database?q=(filters:!((col:database_name,opr:eq,value:default_sandbox)),order_column:changed_on_delta_humanized,order_d
irection:desc,page:0,page_size:100) HTTP/1.1" 308 591
DEBUG [[12:48:07]] DEBUG: urllib3.connectionpool: Starting new HTTP connection (1): mysuperset.site:80 connectionpool.py:245
DEBUG [[12:48:07]] DEBUG: urllib3.connectionpool: http://mysuperset.site:80 "GET connectionpool.py:473
/api/v1/database/?q=(filters:!((col:database_name,opr:eq,value:default_sandbox)),order_column:changed_on_delta_humanized,order_
direction:desc,page:0,page_size:100) HTTP/1.1" 301 134
DEBUG [[12:48:07]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET connectionpool.py:473
/api/v1/database/?q=(filters:!((col:database_name,opr:eq,value:default_sandbox)),order_column:changed_on_delta_humanized,order_
direction:desc,page:0,page_size:100) HTTP/1.1" 401 39
ERROR [[12:48:07]] ERROR: preset_cli.lib: { lib.py:98
"msg": "Missing Authorization Header"
}
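The 401 with "Missing Authorization Header" fits how, as far as I know, requests handles redirects: it drops the Authorization header when a redirect changes host or downgrades https to http, and it does not add it back on a later hop. Below is a simplified stdlib model of that rule (not the library's code; would_strip_auth is a made-up name), run over the three hops from the log above.

```python
from urllib.parse import urlparse

DEFAULT_PORTS = {"http": 80, "https": 443}


def would_strip_auth(old_url: str, new_url: str) -> bool:
    """Simplified model: is the Authorization header dropped on this redirect?"""
    old, new = urlparse(old_url), urlparse(new_url)
    # A different host never gets the credentials.
    if old.hostname != new.hostname:
        return True
    # Upgrading http -> https on the default ports keeps the header.
    if (
        old.scheme == "http"
        and old.port in (80, None)
        and new.scheme == "https"
        and new.port in (443, None)
    ):
        return False
    changed_port = old.port != new.port
    changed_scheme = old.scheme != new.scheme
    default_port = (DEFAULT_PORTS.get(old.scheme), None)
    # Same scheme, explicit vs. implicit default port: treated as unchanged.
    if not changed_scheme and old.port in default_port and new.port in default_port:
        return False
    return changed_port or changed_scheme


# The redirect chain seen in the log: https (308) -> http (301) -> https (401).
hops = [
    "https://mysuperset.site/api/v1/database",
    "http://mysuperset.site/api/v1/database/",
    "https://mysuperset.site/api/v1/database/",
]
has_auth = True
for old, new in zip(hops, hops[1:]):
    if would_strip_auth(old, new):
        has_auth = False  # once dropped, the header is not restored

print(has_auth)
```

Under this model the header is lost on the first hop (https to http), so the final https request arrives without it.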
Ok, so it's more redirect stuff (see the 301). Keep changing stuff to not redirect:
# api/clients/superset.py
...
    def get_resources(self, resource_name: str, **kwargs: Any) -> List[Any]:
        ...
        url = self.baseurl / "api/v1" / resource_name / "" % {"q": query}
+       url = str(url)
+       url = url.replace("?", "/?").replace("//?", "/?")
        ...
...
    def create_resource(self, resource_name: str, **kwargs: Any) -> Any:
        """
        Create a resource.
        """
        url = self.baseurl / "api/v1" / resource_name / ""
+       url = str(url)
+       url = url if url.endswith("/") else url + "/"
        ...
Logs after this change (sync seems to work 🎉 ):
[13:49:30] DEBUG [[13:49:30]] DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): mysuperset.site:443 connectionpool.py:1014
DEBUG [[13:49:30]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET /api/v1/security/csrf_token/ connectionpool.py:473
HTTP/1.1" 200 105
https://mysuperset.site/api/v1/database/?q=(filters:!((col:database_name,opr:eq,value:some_value)),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100)
[13:49:33] DEBUG [[13:49:33]] DEBUG: preset_cli.api.clients.superset: GET superset.py:435
https://mysuperset.site/api/v1/database/?q=(filters:!((col:database_name,opr:eq,value:some_value)),order_column
:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100)
DEBUG [[13:49:33]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET connectionpool.py:473
/api/v1/database/?q=(filters:!((col:database_name,opr:eq,value:some_value)),order_column:changed_on_delta_humanized,order_
direction:desc,page:0,page_size:100) HTTP/1.1" 200 518
INFO [[13:49:33]] INFO: preset_cli.cli.superset.sync.dbt.databases: No database connection found, creating it databases.py:72
https://mysuperset.site/api/v1/database/
{'database_name': 'some_value', 'is_managed_externally': False, 'masked_encrypted_extra': None, 'sqlalchemy_uri': 'some_uri'}
DEBUG [[13:49:33]] DEBUG: preset_cli.api.clients.superset: POST https://mysuperset.site/api/v1/database/ superset.py:459
{
"database_name": "some_value",
"is_managed_externally": false,
"masked_encrypted_extra": null,
"sqlalchemy_uri":
"some_uri"
}
[13:49:36] DEBUG [[13:49:36]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "POST /api/v1/database/ HTTP/1.1" 201 connectionpool.py:473
377
https://mysuperset.site/api/v1/dataset/?q=(filters:!((col:database,opr:rel_o_m,value:2),(col:schema,opr:eq,value:some_value2),(col:table_name,opr:eq,value:some_value3)),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_size:100)
DEBUG [[13:49:36]] DEBUG: preset_cli.api.clients.superset: GET superset.py:435
https://mysuperset.site/api/v1/dataset/?q=(filters:!((col:database,opr:rel_o_m,value:2),(col:schema,opr:eq,value:some_value2
),(col:table_name,opr:eq,value:some_value3)),order_column:changed_on_delta_humanized,order_d
irection:desc,page:0,page_size:100)
DEBUG [[13:49:36]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET connectionpool.py:473
/api/v1/dataset/?q=(filters:!((col:database,opr:rel_o_m,value:2),(col:schema,opr:eq,value:some_value2),(col:table_nam
e,opr:eq,value:some_value3)),order_column:changed_on_delta_humanized,order_direction:desc,page:0,page_si
ze:100) HTTP/1.1" 200 413
INFO [[13:49:36]] INFO: preset_cli.cli.superset.sync.dbt.datasets: Creating dataset model.name.some_value3 datasets.py:125
https://mysuperset.site/api/v1/dataset/
{'database': 2, 'schema': 'some_value2', 'table_name': 'some_value3'}
DEBUG [[13:49:36]] DEBUG: preset_cli.api.clients.superset: POST https://mysuperset.site/api/v1/dataset/ superset.py:459
{
"database": 2,
"schema": "some_value2",
"table_name": "some_value3"
}
[13:49:42] DEBUG [[13:49:42]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "POST /api/v1/dataset/ HTTP/1.1" 201 connectionpool.py:473
3238
[13:49:51] DEBUG [[13:49:51]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "PUT connectionpool.py:473
/api/v1/dataset/1?override_columns=true HTTP/1.1" 200 317
DEBUG [[13:49:51]] DEBUG: preset_cli.api.clients.superset: GET https://mysuperset.site/api/v1/dataset/1 superset.py:398
DEBUG [[13:49:51]] DEBUG: urllib3.connectionpool: https://mysuperset.site:443 "GET /api/v1/dataset/1 HTTP/1.1" 200 connectionpool.py:473
5674
This is obviously the wrong solution; it should work with redirects. But at least it shows where the issue is.
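If someone wants to keep the no-redirect workaround for now, a slightly less brittle way to force the trailing slash than chained string replaces would be to touch only the path component. This is just a sketch with the standard library; with_trailing_slash is a made-up helper name, not something in preset-cli.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return url with its path ending in '/', leaving the query untouched."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(with_trailing_slash("https://mysuperset.site/api/v1/database?q=(page:0,page_size:100)"))
print(with_trailing_slash("https://mysuperset.site/api/v1/database/"))
```

Unlike replacing every "?" in the string, this leaves a "?" that appears inside a filter value alone.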
See here for linked superset issue: https://github.com/apache/superset/issues/25359
I've had success patching issues to connect to my Superset instance.
The central fix was to modify UsernamePasswordAuth
to bring the JWT token into authorisation.
It logs in, gets the CSRF token, fetches a JWT token, and returns the Authorization: Bearer
header for requests.
Can PR this as is, but I'm guessing you would wanna implement it cleaner:
class UsernamePasswordAuth(Auth):  # pylint: disable=too-few-public-methods
    """
    Auth to Superset via username/password.
    """

    def __init__(self, baseurl: URL, username: str, password: Optional[str] = None):
        super().__init__()
        self.csrf_token: Optional[str] = None
        self.baseurl = baseurl
        self.username = username
        self.password = password
        self.token = None
        self.auth()

    def get_headers(self) -> Dict[str, str]:
        headers = {}
        if self.token:
            headers["Authorization"] = f"Bearer {self.token}"
        if self.csrf_token:
            headers["X-CSRFToken"] = self.csrf_token
        return headers

    def auth(self) -> None:
        self._login_and_store_csrf()
        self._fetch_and_store_token()

    def _login_and_store_csrf(self) -> None:
        """
        Login to get CSRF token and set cookies.
        """
        data = {"username": self.username, "password": self.password}
        response = self.session.get(self.baseurl / "login/")
        soup = BeautifulSoup(response.text, "html.parser")
        input_ = soup.find("input", {"id": "csrf_token"})
        csrf_token = input_["value"] if input_ else None
        if csrf_token:
            self.session.headers["X-CSRFToken"] = csrf_token
            data["csrf_token"] = csrf_token
            self.csrf_token = csrf_token
        # set cookies
        self.session.post(self.baseurl / "login/", data=data)

    def _fetch_and_store_token(self) -> None:
        """
        Fetch the JWT token to use for headers
        """
        data = {
            "username": self.username,
            "password": self.password,
            "provider": "db",
            "refresh": True,
        }
        api_login_url = self.baseurl / "api/v1/security/login"
        response = self.session.post(api_login_url, json=data)
        self.token = response.json()["access_token"]
Note I did originally see similar issues with redirects to the trailing slash url and did make some tweaks to that but I'm not certain they're critical. Possibly. The missing Auth header was the key issue I discovered.
Looks like the import-assets and sync commands in superset-cli (0.2.8) do not work with Superset version 3.0.0.
Tried it with both basic authentication and JWT. Works with 2.1.0 but fails on 3.0.0.
Sample command:
superset-cli --jwt-token {jwt_token} --loglevel debug {HOST} sync native ./
The error:
When using --jwt-token, line 736 in superset.py returns a response object that contains HTML (authentication failed), so it fails at line 742 when trying to extract JSON.
When using basic auth, it fails even earlier.
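For what it's worth, the JSON failure would be easier to diagnose if the client checked what kind of body came back before decoding it. A minimal sketch of such a guard, assuming the caller passes in the status code, Content-Type header and body text (parse_json_response is a hypothetical helper, not the code at line 736):

```python
import json
from typing import Any


def parse_json_response(status: int, content_type: str, body: str) -> Any:
    """Parse a JSON API payload, raising a readable error for non-JSON bodies."""
    if "json" not in content_type.lower():
        # Collapse whitespace so an HTML login page fits in one short line.
        snippet = " ".join(body.split())[:80]
        raise ValueError(
            f"Expected JSON but got {content_type!r} (HTTP {status}): {snippet}"
        )
    return json.loads(body)


print(parse_json_response(200, "application/json", '{"message": "OK"}'))

try:
    parse_json_response(200, "text/html; charset=utf-8", "<html><title>Login</title></html>")
except ValueError as exc:
    print(exc)
```

With a check like this, an HTML login page served after a failed auth shows up as a clear error instead of a JSON decode exception.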