apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

Arrow nightly Maven releases don't seem to work #12570

Open GavinRay97 opened 2 years ago

GavinRay97 commented 2 years ago

Following the instructions listed here:

I get the following error when trying to install. I think the content type is being misinterpreted (as HTML rather than XML):

[WARNING] The POM for org.apache.arrow:arrow-flight:jar:8.0.0.dev165 is invalid, transitive dependencies (if any) will not be available: 1 problem was encountered while building the effective model
[FATAL] Non-parseable POM C:\Users\rayga\.m2\repository\org\apache\arrow\arrow-flight\8.0.0.dev165\arrow-flight-8.0.0.dev165.pom: expected = after attribute name (position: TEXT seen ...l="preconnect" href="https://github.githubassets.com" crossorigin>... @15:77)  @ line 15, column 77

lidavidm commented 2 years ago

@davisusanibar were you able to get this to work?

GavinRay97 commented 2 years ago

I've used a regular GitHub repository as a Maven repository before. For that you have to use the "raw" URL:

repositories {
  maven {
    name "expecty"
    url "https://raw.github.com/pniederw/expecty/master/m2repo/"
  }
}

Maybe something like this is needed for tagged releases too? I checked the POM that Maven pulled, and it is the HTML of the GitHub page rather than the actual asset.
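
A quick way to confirm this symptom on a cached file is sketched below. It is only an illustration: `looks_like_html` is a hypothetical helper (not part of Maven or Arrow), and it assumes that an HTML page announces itself with `<!DOCTYPE html` or `<html` within its first 512 bytes, whereas a real POM starts with an XML declaration or a `<project>` element.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) if the given file looks like
# an HTML page rather than an XML POM.
looks_like_html() {
    # Only the first 512 bytes are inspected; the match is case-insensitive.
    head -c 512 "$1" | grep -qiE '<!DOCTYPE html|<html'
}
```

For example, `looks_like_html path/to/arrow-flight-8.0.0.dev165.pom && echo "HTML, not a POM"` would flag a file like the one in the error above.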

(I wanted to start prototyping a project with FlightSQL but there was some issue with it making it into the v7.0.0 release POMs)

Today I will try to write a script that takes the URL to the nightly Java releases, downloads all the assets, and then creates the proper M2 folder structure for the version number.
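
The "proper M2 folder structure" mentioned above is the standard Maven 2 repository layout: the groupId with its dots turned into directories, then the artifactId, then the version, then the file. A minimal sketch of that mapping follows; `m2_relative_path` is a hypothetical helper for illustration and is not the script referred to in this thread.

```shell
#!/bin/bash
# Hypothetical helper: print the location of an artifact file relative to the
# root of a Maven 2 layout repository.
# Arguments: groupId artifactId version fileName
m2_relative_path() {
    # Dots in the groupId become directory separators.
    group_path=$(printf '%s' "$1" | tr '.' '/')
    printf '%s/%s/%s/%s\n' "$group_path" "$2" "$3" "$4"
}

m2_relative_path org.apache.arrow arrow-flight 8.0.0.dev165 arrow-flight-8.0.0.dev165.pom
# prints: org/apache/arrow/arrow-flight/8.0.0.dev165/arrow-flight-8.0.0.dev165.pom
```

This is the same path (with forward slashes) that appears under `.m2\repository` in the error message at the top of the thread.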

I'll publish last night's releases to this repo and share the URL for anyone else who might want a temporary fix until the 7.0.1 or 8.0.0 staging releases 👍

GavinRay97 commented 2 years ago

Here is a Node.js script that downloads the assets from the nightlies and extracts them into a Maven repository structure, along with the 03/03 jars published as a usable M2 repo.

Instructions for use with Gradle/Maven are here: https://github.com/GavinRay97/arrow-nightlies-repo

davisusanibar commented 2 years ago

Hi Team, sorry to join late

Thank you @GavinRay97. The libraries are downloaded, but the POM/jar files are invalid.

Regarding the update to the Arrow Java nightly docs: I am reviewing the issue and I see two options:

  1. Analyze/review/configure how to integrate/use the GitHub nightly releases in a transparent manner
  2. Define a workaround to build the Arrow Java nightly dependencies locally:
    • Add your repo to the documentation
    • Define a more generic integration (without modifying/adding more configuration) to add to the documentation

I am working on a generic nightly build implementation using this shell script:

Code to add to the docs:

#!/bin/bash

# Shell variables
ARROW_JAVA_NIGHTLY_VERSION=${1:-'nightly-2022-03-03-0-github-java-jars'}
DEPENDENCY_TO_INSTALL=${2:-'arrow'}

# Local Variables
TMP_FOLDER=arrow_java_$(date +"%d-%m-%Y")
PATTERN_TO_GET_LIB_AND_VERSION='([a-z].+)-([0-9].[0-9].[0-9].dev[0-9]+).([a-z]+)'

# Application logic
echo $DEPENDENCY_TO_INSTALL
mkdir -p $TMP_FOLDER
pushd $TMP_FOLDER
echo "**************** 1 - Download arrow-java $ARROW_JAVA_NIGHTLY_VERSION dependencies ****************"
wget $( \
    wget \
        -qO- https://api.github.com/repos/ursacomputing/crossbow/releases/tags/$ARROW_JAVA_NIGHTLY_VERSION \
        | jq -r '.assets[] | select((.name | endswith(".pom")) or (.name | endswith(".jar"))) | .browser_download_url' \
        | grep $DEPENDENCY_TO_INSTALL )

echo "**************** 2 - Install arrow java libraries to local repository ****************"
for LIBRARY in *dev*.jar; do # Glob instead of parsing the output of ls
    [[ $LIBRARY =~ $PATTERN_TO_GET_LIB_AND_VERSION ]]
    FILE=$PWD/${BASH_REMATCH[0]}
    if [[ ( ${BASH_REMATCH[0]} == *"$DEPENDENCY_TO_INSTALL"* ) ]];then
        if [ -f "$FILE" ]; then
            FILE=$FILE
        else
            if [ -f "$FILE.jar" ]; then # Out of regex: -javadoc.jar / -sources.jar
                FILE=$FILE.jar
            else 
                if [ -f "$FILE-with-dependencies.jar" ]; then # Out of regex: -with-dependencies.jar
                    FILE=$FILE-with-dependencies.jar
                else 
                    echo "Please review $FILE: it was not installed in the local m2 repository."
                    continue # Skip mvn install for a file that does not exist
                fi
            fi
        fi
        echo "$FILE"
        mvn install:install-file \
            -Dfile="$FILE" \
            -DgroupId=org.apache.arrow \
            -DartifactId=${BASH_REMATCH[1]} \
            -Dversion=${BASH_REMATCH[2]} \
            -Dpackaging=${BASH_REMATCH[3]} \
            -DcreateChecksum=true \
            -DgeneratePom=true
    fi
done
popd
# rm -rf $TMP_FOLDER
echo "Go to your project and execute: mvn clean install"
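
To make the role of `PATTERN_TO_GET_LIB_AND_VERSION` easier to follow, here is a small sketch of what its three capture groups hold for typical file names. The pattern is copied verbatim from the script; `split_artifact_name` is a hypothetical helper added only for this illustration.

```shell
#!/bin/bash
# Same pattern as PATTERN_TO_GET_LIB_AND_VERSION in the script above.
PATTERN='([a-z].+)-([0-9].[0-9].[0-9].dev[0-9]+).([a-z]+)'

# Hypothetical helper: print "artifactId version packaging" for a file name,
# or return 1 if the name does not match the pattern.
split_artifact_name() {
    if [[ $1 =~ $PATTERN ]]; then
        echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]} ${BASH_REMATCH[3]}"
    else
        return 1
    fi
}

split_artifact_name arrow-memory-core-8.0.0.dev165.jar
# prints: arrow-memory-core 8.0.0.dev165 jar
split_artifact_name flight-core-8.0.0.dev165-sources.jar
# prints: flight-core 8.0.0.dev165 sources
```

Note that for a `-sources` or `-javadoc` jar the third group captures the suffix rather than `jar`, which is why the script falls back to checking for `$FILE.jar`.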

Execute: download all dependencies, or only the jars needed

# Download all dependencies (the script uses bash features, so run it with bash rather than sh)
bash arrow_java_nightly.sh nightly-2022-03-03-0-github-java-jars

# Download only the library needed, for example: memory
bash arrow_java_nightly.sh nightly-2022-03-03-0-github-java-jars memory

Use: in your pom.xml, add the dependencies and the version needed

     ...
    <properties>
        <arrow.version>8.0.0.dev165</arrow.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-memory-core</artifactId>
            <version>${arrow.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>arrow-memory-netty</artifactId>
            <version>${arrow.version}</version>
        </dependency>
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>${logback.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>flight-core</artifactId>
            <version>${arrow.version}</version>
        </dependency>
    </dependencies>
    ...

Run:

mvn clean install

Could you please check whether this works on your side?

Thank you in advance.

lidavidm commented 2 years ago

A JIRA was filed here: https://issues.apache.org/jira/browse/ARROW-15865