microsoft / hyperspace

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
https://aka.ms/hyperspace
Apache License 2.0
424 stars 115 forks source link

Refactor Content.rec(..) function to support tail-optimisation #402

Closed dmytroDragan closed 3 years ago

dmytroDragan commented 3 years ago

What is the context for this pull request?

What changes were proposed in this pull request?

Refactor Content.rec(..) function to support tail-optimisation

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests should be enough, because it is just refactoring without changing functionality.

ghost commented 3 years ago

CLA assistant check
All CLA requirements met.

dmytroDragan commented 3 years ago

Hi @andrei-ionescu ,

  1. Sure (might make sense to add also scalafmt version for alignment)
  2. There are already tests for Content class like "Content.files api lists all files from Content object" which rely on it, but I could move rec function to Content object and add more.
sezruby commented 3 years ago

@dmytroDragan Sorry but, could you revert unnecessary changes from scalafmt to reduce the diff? It can be handled with a different PR (would be appreciated if you could push a new PR!), so that we could review & merge only required change for tailrec.

dmytroDragan commented 3 years ago

Hi @sezruby! I can revert changes from scalafmt, but it makes a usage of scalafmt useless in first place.

I`m using 2.7.5 (latest) scalamft with dev conf:

# The following configs are taken from https://github.com/apache/spark/blob/master/dev/.scalafmt.conf
align = none
align.openParenDefnSite = false
align.openParenCallSite = false
align.tokens = []
optIn = {
  configStyleArguments = false
}
danglingParentheses = false
docstrings = JavaDoc
maxColumn = 98

# The following are specific to Hyperspace.
importSelectors = singleLine

If anyone gets different result for this files (like old once), please reach out to me and we could try to figure out.

I could make a PR with only scalafmt of this two files, so my PR will cover only my changes after merge. What do you think?

andrei-ionescu commented 3 years ago

@dmytroDragan, Here is my scalafmt setup in IntelliJ Idea (I have 1.5.1 version):

hyperspace_scalafmt

I usually do Command+Shift+Alt+L to reformat the file and use the following options in the dialog box:

hyperspace_reformat

The formatting is correct, the only annoying thing is that it order the functions by name too. I will be great if you could double check the diffs before pushing the changes.

dmytroDragan commented 3 years ago

Thank you @andrei-ionescu. I have rechecked my configuration and applied result formatting. I have created PR with just scalafmt formated code for this 2 files, so reviewing it could be easier: https://github.com/microsoft/hyperspace/pull/412

dmytroDragan commented 3 years ago

Th problem was with 1.5.1 vs 2.7.5 scalafmt. Fixed version is done as new PR: https://github.com/microsoft/hyperspace/pull/417