Open vamsikarnika opened 2 months ago
With these changes, the bundled xtable-utilities jar comes to around 160 MB; it was around 600 MB before.
Thanks for the optimizations @vamsikarnika, added some comments. Can you run the new jar with the demos to confirm nothing breaks? I suspect s3/gcs sync will fail without the dependencies for the s3/gcs connectors.
Hey @vinishjail97. I'm facing issues running the demos locally on my Mac (an M2). I'm getting a segmentation fault while trying to run the command below.
java -jar xtable-utilities/target/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
Adding terminal crash report below
-------------------------------------
Translated Report (Full Report Below)
-------------------------------------
Process: Terminal [51019]
Path: /System/Applications/Utilities/Terminal.app/Contents/MacOS/Terminal
Identifier: com.apple.Terminal
Version: 2.13 (447)
Build Info: Terminal-447000000000000~1296
Code Type: ARM-64 (Native)
Parent Process: launchd [1]
User ID: 501
Date/Time: 2024-09-19 14:04:33.6076 +0530
OS Version: macOS 13.4.1 (22F770820d)
Report Version: 12
Anonymous UUID: C6BC4607-2EAC-FD44-043D-E0ECE9D0D67E
Sleep/Wake UUID: CE8D2B4E-2C85-4BE8-A588-C203561F81AB
Time Awake Since Boot: 65000 seconds
Time Since Wake: 1707 seconds
System Integrity Protection: enabled
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_PROTECTION_FAILURE at 0x000000016ebffd00
Exception Codes: 0x0000000000000002, 0x000000016ebffd00
Termination Reason: Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process: exc handler [51019]
I'm seeing this error during the dynamic attach of the jar.
2024-09-19 14:27:03 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from file:/tmp/hudi-dataset/people/.hoodie/metadata
2024-09-19 14:27:03 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from file:/tmp/hudi-dataset/people/.hoodie/metadata/.hoodie/hoodie.properties
2024-09-19 14:27:03 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from file:/tmp/hudi-dataset/people/.hoodie/metadata
# WARNING: Unable to get Instrumentation. Dynamic Attach failed. You may add this JAR as -javaagent manually, or supply -Djdk.attach.allowAttachSelf
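The warning above names two workarounds itself. A minimal sketch of the second one, assuming the same jar path and config file as in the earlier command; whether it has any bearing on the segmentation fault is not established in this thread:

```shell
# Assumption: same bundled jar and config file as in the command above.
JAR=xtable-utilities/target/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar
# JVM system property named in the warning; lets the JVM attach an agent to itself.
JVM_OPTS="-Djdk.attach.allowAttachSelf=true"

if [ -f "$JAR" ]; then
  java $JVM_OPTS -jar "$JAR" --datasetConfig my_config.yaml
else
  echo "bundled jar not found at $JAR; build it first" >&2
fi
```

JVM options have to come before `-jar`; anything after the jar path is passed to the application as an argument.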
Yeah, you are right. We need these deps at runtime; mvn dependency:analyze only checks dependencies required at compile time.
I've removed some of the dependencies, such as aws-sdk-bundle, and confirmed that sync still works with s3. But after these changes the jar size hasn't come down by much.
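To see where the remaining size comes from, one option is to group the jar's entries by package prefix and sum their sizes. This is only a sketch: the `summarize_listing` helper is hypothetical, the jar path is assumed to be the one used earlier in the thread, and the sizes reported by `unzip -l` are uncompressed, so they indicate relative weight rather than the exact on-disk contribution.

```shell
# Hypothetical helper: reads `unzip -l` output on stdin, groups entries by
# their first two path segments, and prints summed uncompressed bytes,
# largest first (top 20 groups).
summarize_listing() {
  awk '$1 ~ /^[0-9]+$/ && $4 ~ /\// {
         split($4, seg, "/")
         size[seg[1] "/" seg[2]] += $1
       }
       END { for (k in size) printf "%d\t%s\n", size[k], k }' |
    sort -rn | head -n 20
}

# Assumption: same bundled jar path as in the command earlier in the thread.
JAR=xtable-utilities/target/xtable-utilities-0.2.0-SNAPSHOT-bundled.jar
if [ -f "$JAR" ]; then
  unzip -l "$JAR" | summarize_listing
fi
```

The largest prefixes in the output are the first candidates to check against the module's declared dependencies.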
What is the purpose of the pull request
Brief change log
Verify this pull request
This pull request is a trivial rework / code cleanup without any test coverage.