enso-org / enso

Enso Analytics is a self-service data prep and analysis platform designed for data teams.
https://ensoanalytics.com
Apache License 2.0
7.39k stars 323 forks source link

Lots of warning logs after changing code followed by `buildStdLibAll` #9036

Closed radeusgd closed 9 months ago

radeusgd commented 9 months ago

As discussed on standup, we get lots of warning messages about IR.

I got a simple repro for it.

  1. Do a sbt> buildEngineDistribution.

  2. Run e.g. AWS_Tests - it runs 3 tests (others are skipped if not set up, that's all fine) and prints no warnings.

  3. Modify a file, e.g.:

    diff --git a/distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3_File.enso b/distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3_File.enso
    --- a/distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3_File.enso   (revision f9f134ad33c7dc3e4fbe89e0ba4d9db087f6ee55)
    +++ b/distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3_File.enso   (date 1707765346318)
    @@ -174,6 +174,7 @@
        Deletes the object.
     delete : Nothing
     delete self = if self.is_directory then Error.throw (S3_Error.Error "Deleting S3 folders is currently not implemented." self.uri) else
    +        IO.println "Hello world"
         if Context.Output.is_enabled.not then Error.throw (Forbidden_Operation.Error "Deleting an S3_File is forbidden as the Output context is disabled.") else
             translate_file_errors self <| S3.delete_object self.bucket self.prefix self.credentials . if_not_error <| Nothing
  4. Do sbt> buildStdLibAll (the 'fast rebuild' that lib devs can use for quick testing to avoid slow re-indexing).

  5. Run AWS_Tests again.

Now I'm getting these warnings:

[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Errors from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.AWS_Credential from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Main from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.S3.S3 from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Internal.Request_Body from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Internal.S3_File_Write_Strategy from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Database.Redshift.Redshift_Details from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Database.Redshift.Internal.Redshift_Dialect from IR file: true
[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Database.Redshift.Internal.Redshift_Error_Mapper from IR file: true
radeusgd commented 9 months ago

The warnings do seem to go away if I do another sbt> buildEngineDistribution and run again.

So it seems that it's only occurring (at least for me) if I use the 'incremental' buildStdLibAll. But I use it all the time as buildEngineDistribution is too slow for efficient work.

JaroslavTulach commented 9 months ago

The warning indicates that IR for some module from standard distribution is loaded from .ir file and not from .bindings file. When something like that happens, it is an erroneous state. That makes me believe buildStdLibAll builds the distribution incorrectly.

JaroslavTulach commented 9 months ago

buildEngineDistribution is too slow for efficient work.

It should be faster since https://github.com/enso-org/enso/pull/8961#pullrequestreview-1862466810 as now the up to date check works.

radeusgd commented 9 months ago

buildEngineDistribution is too slow for efficient work.

It should be faster since #8961 (review) as now the up to date check works.

Indeed, it seems faster. Now the rebuild after just changing an Enso file takes about 15s, it used to be far worse than that. Still, the buildStdLibAll usually took about <2s for such small changes. It's still a difference between effectively instant and wasting a bit of patience which is draining when repeated many times during the day.

JaroslavTulach commented 9 months ago

It is 3s on 5c7947ce606e1cb5e79bd10ce4222c84068d8409 and my Ubuntu:

sbt:enso> buildEngineDistribution
[info] Assembly jar up to date: runtime.jar
[info] Assembly jar up to date: runner.jar
[info] Engine package created at built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev
[success] Total time: 3 s

but yes, it is slower than buildStdLibAll which is almost instant.

JaroslavTulach commented 9 months ago

Modifying the S3_File.enso and running:

sbt:enso> runEngineDistribution --run test/AWS_Tests

works correctly. No warnings are printed.

Akirathan commented 9 months ago

A lot of these warnings are also on a clean build in engine benchmarks, for example at https://github.com/enso-org/enso/actions/runs/7908972611/job/21589179363#step:8:1296

GitHub
Benchmark Engine · enso-org/enso@8489316
Hybrid visual and textual functional programming. Contribute to enso-org/enso development by creating an account on GitHub.
JaroslavTulach commented 9 months ago

When I run I also get a lot of warnings, but the real problem is: Who creates the .ir files?

sbt:enso> runtime-integration-tests/testOnly *Equals* -- -z org.enso.interpreter.test.EqualsTest.testTruffleBoolean
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Numbers from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Widget_Helpers from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Boolean from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Nothing from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Function from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Vector from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Range.Extensions from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Illegal_Argument from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Panic from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime.State from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Common from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime.Source_Location from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Existing_File_Behavior from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Generic.Writable_File from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Output_Stream from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Input_Stream from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.File_Access from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Advanced.Temporary_File from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime.Ref from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Any from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Ordering from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Ordering.Natural_Order from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Text_Ordering from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Case_Sensitivity from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Meta from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Meta.Enso_Project from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime.Managed_Resource from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Polyglot from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Array from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Index_Sub_Range from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Random from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Platform from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Environment from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File_By_Line from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Logging from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Extensions from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Json from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Json.Extensions from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Warning from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Set from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Map from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.No_Such_Key from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Pair from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Array_Proxy from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Array_Like_Helpers from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Filter_Condition from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Metadata from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Regex from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Regex.Internal.Replacer from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Illegal_State from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Span from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Range from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Empty_Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Regex.Match from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Regex.Internal.Match_Iterator from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Prim_Text_Helper from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Math from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Helpers from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Locale from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Case from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Unimplemented from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Problem_Behavior from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Maybe from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.List from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Text_Sub_Range from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Time_Of_Day from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Time_Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Time_Zone from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Date_Time from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Time_Period from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Duration from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Period from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Date from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Day_Of_Week from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Date_Time_Formatter from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Time.Format.Tokenizer from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Errors from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Time.Format.Parser from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Time.Format.As_Java_Formatter_Interpreter from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Time.Format.Analyzer from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Date_Range from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Time.Date_Period from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Encoding_Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Matching_Mode from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Location from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Encoding from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.File_Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File_Format_Metadata from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File_Format from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.URI from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Enso_Cloud.Utils from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.Header from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Enso_Cloud.Enso_Secret from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.Request_Body from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.HTTP_Method from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.Response_Body from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Write_Extensions from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.HTTP_Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.HTTP_Status_Code from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.Proxy from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.Response from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.Request from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.Extensions from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Sort_Direction from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Base_64 from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Enso_Cloud.Enso_User from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Enso_Cloud.Enso_File from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Enso_Cloud.Errors from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Regression from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Statistics from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Interval from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.XML from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.XML.XML_Format from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Java_Json from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Network.HTTP.HTTP_Version from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Generic.File_Like from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Process from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Process.Process_Builder from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.Process.Exit_Code from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Wrapped_Error from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Line_Ending_Style from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Text.Normalization from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Ordering.Vector_Lexicographic_Order from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Generic.File_Write_Strategy from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.File_Permissions from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.System.File.Local_File_Write_Strategy from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime.Debug from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Runtime.Thread from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.IO from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Number_Builtins from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Internal.Rounding_Helpers from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Main from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Errors.Deprecated from IR file: true

I thought removing the user caches would fix the problem, but no:

enso$ rm -rf ~/.local/share/enso/cache/ir/

isn't enough! The .ir files must be somewhere else! I believe they are read from ./target/fake_dir/cache/ir/!

JaroslavTulach commented 9 months ago

Removing the fake_dir fixes the problem, for a first run, but then the .ir files get recreated:

enso$ rm -rf target/fake_dir/
sbt:enso> runtime-integration-tests/testOnly *Equals* -- -z org.enso.interpreter.test.EqualsTest.testTruffleBoolean
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
sbt:enso> runtime-integration-tests/testOnly *Equals* -- -z org.enso.interpreter.test.EqualsTest.testTruffleBoolean
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Data.Numbers from IR file: true
[enso::SerializationPool] WARNING: Deserializing module Standard.Base.Widget_Helpers from IR file: true
...

looks like the unit tests ignore the .bindings files.

JaroslavTulach commented 9 months ago

This change seems to fix the problem by preventing unit tests to write down .ir caches:

enso$ git diff
diff --git engine/runtime-integration-tests/src/test/java/org/enso/interpreter/test/TestBase.java engine/runtime-integration-tests/src/test/java/org/enso/interpreter/test/TestBase.java
index 5c45398415..b80e06e218 100644
--- engine/runtime-integration-tests/src/test/java/org/enso/interpreter/test/TestBase.java
+++ engine/runtime-integration-tests/src/test/java/org/enso/interpreter/test/TestBase.java
@@ -46,6 +46,7 @@ public abstract class TestBase {
         .allowIO(IOAccess.ALL)
         .allowAllAccess(true)
         .option(RuntimeOptions.LOG_LEVEL, Level.WARNING.getName())
+        .option(RuntimeOptions.DISABLE_IR_CACHES, "true")
         .logHandler(System.err)
         .option(RuntimeOptions.STRICT_ERRORS, "true")
         .option(

shall I create a PR to integrate it, @Akirathan?

Update: Done in https://github.com/enso-org/enso/pull/9070/commits/3fde985866860dccd2cc07d884ade933e5128ac2

radeusgd commented 9 months ago

It is 3s on 5c7947c and my Ubuntu:

sbt:enso> buildEngineDistribution
[info] Assembly jar up to date: runtime.jar
[info] Assembly jar up to date: runner.jar
[info] Engine package created at built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev
[success] Total time: 3 s

but yes, it is slower than buildStdLibAll which is almost instant.

I get the same logs as above and the same timing (3s) only when there are no changes. But if there are no changes, why would I rebuild at all??

However, when I try to change one line in AWS library, for some reason it causes the JARs to be rebuilt and totals ~20s:

[IJ]buildEngineDistribution
[info] List(cargo, build, -p, enso-parser-jni, -Z, unstable-options, --out-dir, X:\NBO\enso\target\rust\debug)
[info] compiling 1 Scala source to X:\NBO\enso\lib\scala\version-output\target\scala-2.13\classes ...
[info] done compilingwarning: `--out-dir` filename collision.
The lib target `enso_parser` in package `enso-parser-jni v0.1.0 (X:\NBO\enso\lib\rust\parser\jni)` has the same output filename as the lib target `enso_parser` in package `enso-parser-jni v0.1.0 (X:\NBO\enso\lib\rust\parser\jni)`.
<rust warnings...>
    Finished dev [unoptimized] target(s) in 0.35s

[info] 5627 file(s) merged using strategy 'First' (Run the task at debug level to see the details)
[info] 10 file(s) merged using strategy 'Concat' (Run the task at debug level to see the details)
[info] 2 file(s) merged using strategy 'PreferProject' (Run the task at debug level to see the details)
[info] 51 file(s) merged using strategy 'Discard' (Run the task at debug level to see the details)
[info] Built: runtime.jar
[info] Jar hash: 49a5e8238a573e82375b773158293de350f51ba2
[info] 29 file(s) merged using strategy 'First' (Run the task at debug level to see the details)
[info] 8 file(s) merged using strategy 'Concat' (Run the task at debug level to see the details)
[info] 76 file(s) merged using strategy 'Discard' (Run the task at debug level to see the details)
[info] Built: runner.jar
[info] Jar hash: be75e8cd395ebea7f9e3794dd19c1ef0fac8541b

[info] Regenerating manifest for [distribution\lib\Standard\AWS\0.0.0-dev].
[info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\AWS
[info] Engine package created at built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev
[success] Total time: 20 s, completed 15 lut 2024, 12:10:42
[IJ]

Similarly, if I change something in Base (which is vastly larger than AWS), that time jumps up to 27s.

The 3s would be livable. The ~30s is of course bearable but much more annoying than mere 3s.

One of the issues is that the JARs for some reason get rebuilt (sounds like a bug in the incrementality?). But even if they weren't, I think at least 15s of that time is the time to re-index the changed library (which takes time especially for larger libs like Base).

radeusgd commented 9 months ago

I tried reporting timings:

Index: project/DistributionPackage.scala
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/project/DistributionPackage.scala b/project/DistributionPackage.scala
--- a/project/DistributionPackage.scala (revision dd6049ee2d7fc030d5c58b9332181b94c4cda430)
+++ b/project/DistributionPackage.scala (date 1707995779324)
@@ -1,14 +1,15 @@
 import io.circe.yaml
-import io.circe.syntax._
+import io.circe.syntax.*
 import sbt.internal.util.ManagedLogger
-import sbt._
+import sbt.*
 import sbt.io.syntax.fileToRichFile
 import sbt.util.{CacheStore, CacheStoreFactory, FileInfo, Tracked}

-import scala.sys.process._
-
+import scala.sys.process.*
 import org.enso.build.WithDebugCommand

+import java.time.Duration
+
 object DistributionPackage {

   /** File extensions. */
@@ -248,11 +249,16 @@
           path.toString
         )
         log.debug(command.mkString(" "))
+        val start = System.nanoTime()
         val exitCode = Process(
           command,
           None,
           "JAVA_OPTS" -> "-Dorg.jline.terminal.dumb=true"
         ).!
+        val end = System.nanoTime()
+        log.info(
+          s"Generating index for $libName took ${Duration.ofNanos(end - start)}."
+        )
         if (exitCode != 0) {
           throw new RuntimeException(s"Cannot compile $libMajor.$libName.")
         }
Index: project/LibraryManifestGenerator.scala
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/project/LibraryManifestGenerator.scala b/project/LibraryManifestGenerator.scala
--- a/project/LibraryManifestGenerator.scala    (revision dd6049ee2d7fc030d5c58b9332181b94c4cda430)
+++ b/project/LibraryManifestGenerator.scala    (date 1707995830453)
@@ -1,6 +1,8 @@
-import sbt._
+import sbt.*
 import sbt.util.CacheStoreFactory

+import java.time.Duration
+
 /** A helper for generating manifests for bundled libraries. */
 object LibraryManifestGenerator {

@@ -64,6 +66,7 @@
     )

     log.debug(s"Running [$command].")
+    val start = System.nanoTime()
     val exitCode = sys.process
       .Process(
         command,
@@ -71,6 +74,10 @@
         "ENSO_EDITION_PATH" -> file("distribution/editions").getCanonicalPath
       )
       .!
+    val end = System.nanoTime()
+    log.info(
+      s"Generating manifest for $projectPath took ${Duration.ofNanos(end - start)}."
+    )
     if (exitCode != 0) {
       val message = s"Command [$command] has failed with code $exitCode."
       log.error(message)

With that, I even somehow managed to avoid the JAR rebuild when modifying AWS:

[info] Assembly jar up to date: runtime.jar
[info] Assembly jar up to date: runner.jar
[info] Regenerating manifest for [distribution\lib\Standard\AWS\0.0.0-dev].
[info] Generating manifest for distribution\lib\Standard\AWS\0.0.0-dev took PT2.0367343S.
[info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\AWS
[info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\AWS took PT6.6912344S.
[info] Engine package created at built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev
[success] Total time: 11 s, completed 15 lut 2024, 12:18:35
[IJ]

And for modifying a file in Base:

[info] Assembly jar up to date: runtime.jar
[info] Assembly jar up to date: runner.jar
[info] Regenerating manifest for [distribution\lib\Standard\Base\0.0.0-dev].
[info] Generating manifest for distribution\lib\Standard\Base\0.0.0-dev took PT3.9066128S.
[info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base
[info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\Base took PT11.9769456S.
[info] Engine package created at built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev
[success] Total time: 19 s, completed 15 lut 2024, 12:19:53

As we can see re-generating the index takes 6-12 seconds depending on the size of the library. That is the biggest slowness in this process I think.

Moreover, we have re-generating the manifest on each change which takes additional 2-3 seconds. IIRC this manifest is only really needed for dependency detection when installing libraries - so not at all needed during regular development of std-lib. Maybe we could disable it? OR maybe we could bake it into the index-regeneration process that runs anyway? The manifest essentially analyzes what modules are imported within the package to figure out the dependencies. I assume from the 2-3 seconds it runs, most of it is JVM warmup, and a small part the parser - both of which run anyway as part of the index generator. So it seems that running it separately is not very useful - it used to be run before we even had caches and the indexing so it had to be a separate process, no there's no reason it would have to be separate. OTOH it requires some work to change that and the win wouldn't be that big - only 2-3s per library.

radeusgd commented 9 months ago

I guess maybe what we have to do is avoid the buildEngineDistribution in the default libs workflow altogether (ofc keeping it as a 'clean build' tool).

  1. For just .enso file changes, @jdunkerley was proposing a symlink based solution that allows the engine to use the sources that we edit directly, skipping the build + index-rebuild steps. That will likely be fastest? Shall we try to 'standardize' this workflow by creating commands that can set it up?
  2. When modifying Java lib helpers we used to rely on buildStdLibAll to rebuild the Java helpers JARs. If buildStdLibAll is broken for the IR, maybe we should retire it, but add a buildStdLibJavaHelpers command that only does the 'rebuild the helper JARs' part?

This way we could likely live without buildStdLibAll at all, relying on the already rather fast buildEngineDistribution for a 'clean build' and relying on faster tools for just small incremental tweaks to the libraries (as these are most common and most annoying when they require long iteration times).

radeusgd commented 9 months ago
  1. For just .enso file changes, @jdunkerley was proposing a symlink based solution that allows the engine to use the sources that we edit directly, skipping the build + index-rebuild steps. That will likely be fastest? Shall we try to 'standardize' this workflow by creating commands that can set it up?

I was thinking if we can integrate it more into our development flow, as relying on the symlink (that IIRC gets overwritten after every buildEngineDistribution) feels a bit hacky. I think we should be able to exploit the editions infrastructure to configure the resolver in such a way to use the libraries within distribution (not built-distribution) directly without file-system hacks. I'd be happy to look into it a bit.

JaroslavTulach commented 9 months ago

[info] List(cargo, build, -p, enso-parser-jni, -Z, unstable-options, --out-dir, X:\NBO\enso\target\rust\debug) [info] compiling 1 Scala source to X:\NBO\enso\lib\scala\version-output\target\scala-2.13\classes ... [info] done compilingwarning: --out-dir filename collision. The lib target enso_parser in package enso-parser-jni v0.1.0 (X:\NBO\enso\lib\rust\parser\jni) has the same output filename as the lib target enso_parser in package enso-parser-jni v0.1.0 (X:\NBO\enso\lib\rust\parser\jni).

This Rust recompilation isn't happening on Linux, but I have seen it on Mac OS X. In any case it is shouldn't be happening - nobody is changing the Rust parser during regular workflows (with the exception of @kazcw).

JaroslavTulach commented 9 months ago

Radek, you are trying to invent various "shortcuts" to avoid...

[info] Regenerating manifest for [distribution\lib\Standard\AWS\0.0.0-dev]. [info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-dev\lib\Standard\AWS

...but that's not going to work. The fact that you skip generating index is exactly the reason why you are seeing the log message! If .bindings file is missing or outdated, then an .ir file is saved instead. The message

[WARN] [2024-02-12T20:16:11+01:00] [enso.org.enso.interpreter.runtime.SerializationPool] Deserializing module Standard.AWS.Errors from IR file: true

will appear on next startup. There is no way around it. Your development workflow isn't caching friendly.

radeusgd commented 9 months ago

Radek, you are trying to invent various "shortcuts" to avoid...

[info] Regenerating manifest for [distribution\lib\Standard\AWS\0.0.0-dev]. [info] Generating index for built-distribution\enso-engine-0.0.0-dev-windows-amd64\enso-0.0.0-

  1. Is the manifest needed to run at all? I don't think so. IIRC it was created only for the purpose of library dependency resolution.

There is no way around it. Your development workflow isn't caching friendly.

  1. We used to be able to have short iteration times in the past. I'm just trying to recover that as it is immensely helpful when working on libraries. Given that Enso does not have any static analysis support in the IDE, the only way to iterate on it is by running tests. In a daily workflow I need to run these 10s-100s of times and having to wait for it for a longer time vastly decreases the efficiency (not just time-wise but focus-wise as it's easy to get distracted waiting for the thing to build).

I guess if for now we decide that the current state is OK - ok. But I'm just looking for ideas to improve this. Maybe we could consider an incremental cache which only invalidates modified modules and stuff that depends on them? Maybe the symlink solution suggested by James + running with no IR caches would prove to give us quicker iteration times?

I think these are questions worth asking.

radeusgd commented 9 months ago

Your development workflow isn't caching friendly.

Just to add 2 cents - this is the main workflow when developing Enso libraries. So I think trying to make it easy to work with is really important to work nicely.

When modifying the Java libraries we can benefit from incremental compilation and changing just one file often rebuilds in <3s which is great.

While I understand it may not worth it to spend resources improving it right now, I feel like summing up with '(Enso developer workflow) isn't caching friendly' suggests that Enso developers cannot dream of a similar experience at all. Maybe I read this wrong though.

JaroslavTulach commented 9 months ago

running with no IR caches would prove to give us quicker iteration times

That's one option. The other is to disable the log message.

radeusgd commented 9 months ago

running with no IR caches would prove to give us quicker iteration times

That's one option. The other is to disable the log message.

Is that safe? I understood that the log message is there because this behaviour is somehow bad.

If there's nothing stopping us, maybe we should disable the log message in a 'dev' mode (e.g. version 0.0.0-dev OR some env var) and keep the log message otherwise? This way we'd know if this happens in unexpected circumstances, but can keep our workflow.

enso-bot[bot] commented 9 months ago

Jaroslav Tulach reports a new STANDUP for yesterday (2024-02-20):

Progress: - fixed https://github.com/enso-org/enso/pull/9101

Next Day: Rust parser