cloudera / hue

Open source SQL Query Assistant service for Databases/Warehouses
https://cloudera.com
Apache License 2.0

Hue database dump not working on hue 4.8.0 #2623

Closed: rahil-c closed this issue 2 years ago

rahil-c commented 2 years ago

Is the issue already present in https://github.com/cloudera/hue/issues or discussed in the forum https://discourse.gethue.com?

The issue is discussed in these forums

Describe the bug:

When running Hue 4.8.0 on AWS EMR 5.32.0, the following command

/usr/lib/hue/build/env/bin/hue dumpdata > ./hue-mysql.json

fails with the following exception:

WARNINGS:
jobbrowser.DagDetails.dag_info: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
jobbrowser.QueryDetails.hive_query: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
CommandError: Unable to serialize database: (1146, "Table 'hue.hive_query' doesn't exist")

Cause of issue

Investigation

# You'll have to do the following manually to clean this up:
#   * Rearrange models' order
#   * Make sure each model has one field with primary_key=True
#   * Make sure each ForeignKey has `on_delete` set to the desired behavior.
#   * Remove `managed = False` lines if you wish to allow Django to create, modify, and delete the table
# Feel free to rename the models, but don't rename db_table values or field names.

Hue version or source? (e.g. open source 4.5, CDH 5.16, CDP 1.0...). System info (e.g. OS, Browser...).

open source hue 4.8.0+

Special Note

Can the Hue community please prioritize this fix? It seems to be affecting several customers on AWS EMR.

Harshg999 commented 2 years ago

Hi @rahil-c, thank you for the detailed description!

After removing the managed = False lines as suggested, can you try running a migration to see if that works? For example, can you try the migrate or makemigrations command (see https://docs.gethue.com/administrator/administration/operations/#commands)?

rahil-c commented 2 years ago

Hi @Harshg999, thanks for the quick response.

I tried both of the following commands:

[hadoop@ip-1XX-XX_XX ~]$ /usr/lib/hue/build/env/bin/hue migrate

WARNINGS:
?: (mysql.W002) MySQL Strict Mode is not set for database connection 'default'
    HINT: MySQL's Strict Mode fixes many data integrity problems in MySQL, such as data truncation upon insertion, by escalating warnings into errors. It is strongly recommended you activate it. See: https://docs.djangoproject.com/en/1.11/ref/databases/#mysql-sql-mode
jobbrowser.DagDetails.dag_info: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
jobbrowser.QueryDetails.hive_query: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
Operations to perform:
  Apply all migrations: admin, auth, axes, beeswax, contenttypes, desktop, jobsub, oozie, pig, sessions, sites, useradmin
Running migrations:
  No migrations to apply

[hadoop@ip-1XX-XX_XX ~]$ /usr/lib/hue/build/env/bin/hue makemigrations
System check identified some issues:

WARNINGS:
jobbrowser.DagDetails.dag_info: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
jobbrowser.QueryDetails.hive_query: (fields.W342) Setting unique=True on a ForeignKey has the same effect as using a OneToOneField.
    HINT: ForeignKey(unique=True) is usually better served by a OneToOneField.
No changes detected

It seems that some other tables were added, but not the ones from the original commit, such as hive_query. The dumpdata command still hits the same error regardless of these commands being run.

Harshg999 commented 2 years ago

Hi @rahil-c, it looks like this code belongs to a feature that was never fully implemented. You can try reverting the commit https://github.com/cloudera/hue/commit/f999ac696c88c6c19060f37afaa7c019e28c8ba5 to see whether the database dump works again.

Hope it helps!

rahil-c commented 2 years ago

Hi @Harshg999, I tried commenting out the classes introduced by commit f999ac6 above while the Hue service was running. With them commented out, the database dump command runs successfully.

I'm just curious: would commenting out or reverting this commit have any other impact elsewhere in Hue?
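
For anyone who would rather not edit the installed source, a possible alternative is to leave the jobbrowser app out of the dump. This is only a sketch and was not tried in this thread: it assumes that `hue dumpdata` forwards Django's `--exclude` option unchanged, and that the models behind the missing hive_query table all live in the jobbrowser app, as the warnings above suggest. The function name is made up for illustration.

```shell
# Path taken from the report above; override HUE_BIN if Hue lives elsewhere.
HUE_BIN="${HUE_BIN:-/usr/lib/hue/build/env/bin/hue}"

# Dump every app except jobbrowser, whose models point at tables
# (such as hive_query) that do not exist in the Hue database.
# Assumption: `hue dumpdata` passes --exclude through to Django's dumpdata.
dump_hue_without_jobbrowser() {
  out_file="$1"
  "$HUE_BIN" dumpdata --exclude jobbrowser > "$out_file"
}

# Usage (not run here):
#   dump_hue_without_jobbrowser ./hue-mysql.json
```

Any jobbrowser data would be missing from such a dump, which should only matter if the unfinished feature was actually in use.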

Harshg999 commented 2 years ago

This should not have any impact elsewhere in Hue since the feature was not fully completed.

It was intended to work against a separate database, configured under [[query_database]] in the config. The feature flag is enable_hive_query_browser under [jobbrowser], which is false by default. So as long as nothing interacts with these settings, commenting out or reverting the code is safe.
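
Before commenting out or reverting the code, it may be worth confirming that the flag has not been switched on. The sketch below prints any uncommented enable_hive_query_browser line in a hue.ini file. The default path is an assumption and may differ per install, the function name is hypothetical, and the check matches the key anywhere in the file rather than only under [jobbrowser].

```shell
# Assumed default location; EMR and other installs may keep hue.ini elsewhere.
HUE_INI="${HUE_INI:-/etc/hue/conf/hue.ini}"

# Print any uncommented enable_hive_query_browser setting found in the file.
# No output means the flag is unset, so the default (false) applies.
show_hive_query_browser_flag() {
  grep -E '^[[:space:]]*enable_hive_query_browser[[:space:]]*=' "$1" || true
}

# Usage (not run here):
#   show_hive_query_browser_flag "$HUE_INI"
```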