tableau / hyper-api-samples

Sample code to get started with the Hyper API.
https://help.tableau.com/current/api/hyper_api/en-us/index.html
MIT License
132 stars 71 forks source link

java/scala: InsertDataIntoSingleTable - write hyper file to an arbitrary dir #37

Closed xhudik closed 3 years ago

xhudik commented 3 years ago

Hi, I have a problem with writing Hyper file to a temp file (/tmp) directory

...
    val EXTRACT_TABLE = new TableDefinition(
      new TableName("Extract", "Extract"))
      .addColumn("Customer ID", SqlType.text(), NOT_NULLABLE)
      .addColumn("Customer Name", SqlType.text(), NOT_NULLABLE)
      .addColumn("Loyalty Reward Points", SqlType.bigInt(), NOT_NULLABLE)
      .addColumn("Segment", SqlType.text(), NOT_NULLABLE);

    //here is a problem - it cannot create a file/database in tmp dir (it creates empty file)
    val tmpHyperFile = Files.createTempFile("stataggregator-",".hyper")

    val process = new HyperProcess(Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU)
    val connection = new Connection(process.getEndpoint(),
      tmpHyperFile.toString(),
      CreateMode.CREATE_AND_REPLACE)

    val catalog = connection.getCatalog();

    catalog.createSchema(new SchemaName("Extract"));
    catalog.createTable(EXTRACT_TABLE);

    val inserter = new Inserter(connection, EXTRACT_TABLE)

    logger.info("TRY will start")

    try {

      // Insert data into "Extract"."Extract" table
      inserter.add("DK-13375").add("Dennis Kane").add(518).add("Consumer").endRow();
      inserter.add("EB-13705").add("Ed Braxton").add(815).add("Corporate").endRow();
      inserter.execute()

      logger.info("DataFrame copied to: " + tmpHyperFile)
...

If I run the code above, I got an empty tmpHyperFile; if the file is defined as: val customerDatabasePath = Paths.get("localfile"); it works well.

Any idea why defining a dir in a hyper file name makes empty files?

xhudik commented 3 years ago

in other words:

   val connection = new Connection(process.getEndpoint(),
      Files.createTempFile("aggregator-",".hyper").toString(),
      CreateMode.CREATE_AND_REPLACE
      )

will end up with an empty temp file in tmp dir; while

   val connection = new Connection(process.getEndpoint(),
     "/tmp/someFile.hyper",
      CreateMode.CREATE_AND_REPLACE
      )

works well and creates a valid /tmp/someFile.hyper with content. Files.createTempFile("aggregator-",".hyper").toString() gives proper String

I looked at log and for the failed run, I found this:

{..."error":"The file \"/tmp/stataggregator-2812899744936367251.hyper\" is not a Hyper database.","elapsed-registry-release":8.97e-07,"elapsed":0.000119845}}
xhudik commented 3 years ago

and finally :) -a full program in Java

import com.tableau.hyperapi.*;

import javax.xml.validation.Schema;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.nio.file.Files;
import static com.tableau.hyperapi.Nullability.NOT_NULLABLE;

/**
 * An example demonstrating a simple single-table Hyper file including table creation and data insertion with different types
 */
public class InsertDataIntoSingleTable {
    /**
     * The table is called "Extract" and will be created in the "Extract" schema.
     * This has historically been the default table name and schema for extracts created by Tableau
     */
    private static TableDefinition EXTRACT_TABLE = new TableDefinition(
            new TableName("Extract", "Extract"))
            .addColumn("Customer ID", SqlType.text(), NOT_NULLABLE)
            .addColumn("Customer Name", SqlType.text(), NOT_NULLABLE)
            .addColumn("Loyalty Reward Points", SqlType.bigInt(), NOT_NULLABLE)
            .addColumn("Segment", SqlType.text(), NOT_NULLABLE);

    /**
     * The main function
     *
     * @param args The args
     */
    public static void main(String[] args) {
        System.out.println("EXAMPLE - Insert data into a single table into a new Hyper file");

        // Path customerDatabasePath = Paths.get("customers.hyper");
        try{
            Path customerDatabasePath = Paths.get(Files.createTempFile("stataggregator-HHHH-",".hyper").toString());

        // Starts the Hyper Process with telemetry enabled to send data to Tableau.
        // To opt out, simply set telemetry=Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU.
        try (HyperProcess process = new HyperProcess(Telemetry.SEND_USAGE_DATA_TO_TABLEAU)) {
            // Creates new Hyper file "customer.hyper"
            // Replaces file with CreateMode.CREATE_AND_REPLACE if it already exists
            try (Connection connection = new Connection(process.getEndpoint(),
                    customerDatabasePath.toString(),
                    CreateMode.CREATE_AND_REPLACE)) {
                Catalog catalog = connection.getCatalog();

                catalog.createSchema(new SchemaName("Extract"));
                catalog.createTable(EXTRACT_TABLE);

                // Insert data into "Extract"."Extract" table
                try (Inserter inserter = new Inserter(connection, EXTRACT_TABLE)) {
                    inserter.add("DK-13375").add("Dennis Kane").add(518).add("Consumer").endRow();
                    inserter.add("EB-13705").add("Ed Braxton").add(815).add("Corporate").endRow();
                    inserter.execute();
                }

                // The table names in the "Extract" schema
                List<TableName> tablesInDatabase = catalog.getTableNames(new SchemaName("Extract"));
                System.out.println("Tables available in " + customerDatabasePath + " are: " + tablesInDatabase);

                // Number of rows in the "Extract"."Extract" table
                // executeScalarQuery is for executing a query that returns exactly one row with one column
                long rowCount = connection.<Long>executeScalarQuery(
                        "SELECT COUNT(*) FROM " + EXTRACT_TABLE.getTableName()
                ).get();
                System.out.println("The number of rows in table " + EXTRACT_TABLE.getTableName() + " is " + rowCount + "\n");
            }
            System.out.println("The connection to the Hyper file has been closed");
        }
        System.out.println("The Hyper process has been shut down");
    }catch(Exception e)
{
    e.printStackTrace();
}
}
}

gives:

EXAMPLE - Insert data into a single table into a new Hyper file
com.tableau.hyperapi.HyperException: unable to drop database: The file "/tmp/stataggregator-HHHH-7100515284884419940.hyper" is not a Hyper database.: DatabaseId: "hyper.file:/tmp/stataggregator-HHHH-7100515284884419940.hyper"
Context: 0x5fdfad59
    at com.tableau.hyperapi.Connection.<init>(Connection.java:180)
    at com.tableau.hyperapi.Connection.<init>(Connection.java:81)
    at InsertDataIntoSingleTable.main(InsertDataIntoSingleTable.java:44)
jkammerer commented 3 years ago

Hello, CREATE_AND_REPLACE can only replace hyper files. Internally it uses DROP DATABASE IF EXISTS. In Hyper, this SQL statement will fail if the file exists but is not a database file (for security reasons).

Therefore, Hyper does not create an empty file, Files.createTempFile("aggregator-",".hyper") does. CREATE_AND_REPLACE then fails, leaving you with the empty file you created. In your second example, there is no empty file and CREATE_AND_REPLACE successfully creates a Hyper file.

Cheers, Jonas

xhudik commented 3 years ago

hmmm, I see - it makes sense, maybe some warning about necessity to have filename that doesnt actually exist on FS would be helpfull.

Anyway, thanks a lot for the fast reply!

premsagarreddy commented 2 years ago

@xhudik @jkammerer Hi There,

Please someone share the scala API code,to write .hyper file

xhudik commented 2 years ago

uff- I don't have it anymore, but maybe I'd read https://help.tableau.com/current/api/hyper_api/en-us/docs/hyper_api_examples.html (for java) and try to accommodate for scala. At least, this is what I did

jonas-eckhardt commented 2 years ago

@premsagarreddy feel free to contribute a scala example back to the samples repo (please use the "Community Supported" folder). That would help others who are facing the same question.

Cheers Jonas