apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0

[Subtask] Snippet check and remediation #685

Closed justinmclean closed 10 months ago

justinmclean commented 10 months ago

Copied code may have the wrong headers, or AI code generation may have accidentally reproduced existing code. Run a snippet checker on the codebase to find any such problems, and fix any IP issues that arise.

justinmclean commented 10 months ago

I used ScanOSS (https://www.scanoss.com), which is free for community use. I'm not 100% sure how it compares to FossID, but it looks like it has a larger database (3 trillion lines of known OSS code, 100 billion known OSS files, 209 million known OSS URLs).

justinmclean commented 10 months ago

Issues found:

@jerryshao Missing ASF header: /api/src/main/java/com/datastrato/gravitino/exceptions/RESTException.java https://github.com/apache/iceberg/blob/f3e50717149b61d4701c2691be50cd2442afc967/api/src/main/java/org/apache/iceberg/exceptions/RESTException.java

@xunliu A section of /bin/common.sh (e.g. the check_java_version, addEachJarInDirRecursive, and addJarInDir functions) is copied from somewhere.

@FANNG1 Copied EXCEPTION_ERROR_CODES: /catalogs/catalog-lakehouse-iceberg/src/main/java/com/datastrato/gravitino/catalog/lakehouse/iceberg/web/IcebergExceptionMapper.java https://github.com/apache/iceberg/blob/f3e50717149b61d4701c2691be50cd2442afc967/core/src/test/java/org/apache/iceberg/rest/RESTCatalogAdapter.java

@jerryshao Missing ASF header: /common/src/main/java/com/datastrato/gravitino/rest/RESTMessage.java https://github.com/apache/iceberg/blob/f3e50717149b61d4701c2691be50cd2442afc967/core/src/main/java/org/apache/iceberg/rest/RESTMessage.java

@jerryshao Missing ASF header: /common/src/main/java/com/datastrato/gravitino/rest/RESTUtils.java https://github.com/apache/iceberg/blob/f3e50717149b61d4701c2691be50cd2442afc967/core/src/main/java/org/apache/iceberg/rest/RESTUtil.java

@yuqi1129 Missing ASF header (and license and notice information) /core/src/main/java/com/datastrato/gravitino/utils/Bytes.java https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/utils/Bytes.java
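For re-checking the header items above once fixes land, here is a minimal sketch. It is not part of ScanOSS, and the names `has_asf_header` and `find_missing_headers` are hypothetical. It lists files whose first lines lack the opening phrase of the standard ASF source header. It only detects a missing header; it cannot tell whether a file was actually copied from another project, so it complements a snippet scan rather than replacing it.

```python
from pathlib import Path

# Opening phrase of the standard ASF source header.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def has_asf_header(text, max_lines=30):
    """Return True if the ASF marker appears in the first max_lines lines."""
    head = text.splitlines()[:max_lines]
    return any(ASF_MARKER in line for line in head)


def find_missing_headers(root, suffixes=(".java", ".sh")):
    """Return sorted relative paths of files under root that lack the header.

    Only files whose extension is in `suffixes` are checked; the default
    set is an assumption for this sketch, not a project rule.
    """
    root = Path(root)
    missing = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            if not has_asf_header(text):
                missing.append(path.relative_to(root).as_posix())
    return sorted(missing)
```

Calling `find_missing_headers(".")` from the repository root would return every `.java` and `.sh` file without the header, so the result needs to be filtered down to the files that were actually derived from ASF projects.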

jerryshao commented 10 months ago

I think it's OK for me to add the ASF header for my parts; they're mainly derived from Apache Iceberg.

justinmclean commented 10 months ago

Everything has been updated except: @xunliu A section of /bin/common.sh (e.g. the check_java_version, addEachJarInDirRecursive, and addJarInDir functions) is copied from somewhere.

xunliu commented 10 months ago

Hi @justinmclean

"@xunliu A section of /bin/common.sh (e.g. the check_java_version, addEachJarInDirRecursive, and addJarInDir functions) is copied from somewhere."

I copied them from https://github.com/apache/submarine/blob/master/bin/common.sh

justinmclean commented 10 months ago

@xunliu Can you update the header, add the file to LICENSE, and add this to NOTICE: "Apache Submarine Copyright 2019 and onwards The Apache Software Foundation."