side effects are not always contained in first expression of chain

Side effects that should be contained in the first expression of chain (the expression that is executed in a sub-execution environment) can emerge in the containing execution, where they are also planned, executed, and drained.

To reproduce, start with an empty dest bucket and run the following in InfluxDB OSS:

import "csv"
import "experimental"
import "system"

csvdata ="
#group,false,false,true,true,false,false,true,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string,string
#default,_result,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement
,,0,2018-04-06T10:49:41.565Z,2020-04-06T11:49:41.564Z,2020-02-22T15:01:00Z,50,bottom_degrees,h2o_temperature
"

A = csv.from( csv: csvdata )
        |> map( fn: (r) => ({ r with _time: system.time() }) )
        |> to( bucket: "dest" )

B = csv.from( csv: csvdata )

experimental.chain( first: A, second: B )

The output will include the following result, which should not be present:

Result: to5
Table: keys: [_field, _measurement, _start, _stop]
         _field:string     _measurement:string                     _start:time                      _stop:time                      _time:time                  _value:float  
----------------------  ----------------------  ------------------------------  ------------------------------  ------------------------------  ----------------------------  
        bottom_degrees         h2o_temperature  2018-04-06T10:49:41.565000000Z  2020-04-06T11:49:41.564000000Z  2021-04-07T21:36:28.975398046Z                            50

The dest bucket will also contain two points when it should contain only one:

Result: _result
Table: keys: [_start, _stop, _field, _measurement]
                   _start:time                      _stop:time           _field:string     _measurement:string                      _time:time                  _value:float  
------------------------------  ------------------------------  ----------------------  ----------------------  ------------------------------  ----------------------------  
1970-01-01T00:00:00.000000000Z  2021-04-07T21:36:30.210508450Z          bottom_degrees         h2o_temperature  2021-04-07T21:32:27.284064796Z                            50  
1970-01-01T00:00:00.000000000Z  2021-04-07T21:36:30.210508450Z          bottom_degrees         h2o_temperature  2021-04-07T21:32:27.321756596Z                            50
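
The number of points in the bucket can be checked with a query along these lines. This is a minimal sketch, not necessarily the query used to produce the output above; the filter values are taken from the CSV data in the reproduction, and range(start: 0) is an assumption chosen to cover the whole bucket.

```flux
// Sketch: count the points written to the dest bucket.
// range(start: 0) reads everything since the Unix epoch.
from(bucket: "dest")
    |> range(start: 0)
    |> filter(fn: (r) => r._measurement == "h2o_temperature" and r._field == "bottom_degrees")
    |> count()
```

If the side effect were contained in the first expression, the count should be 1 after a single run against an empty bucket. With the behavior described here it is 2.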

An attempt to reproduce this in pure Flux, using the sql.to function, was unsuccessful: the write does not occur twice. First create the table:

[thurston@peyto] table-find-side-effects: sqlite3 /tmp/to.db
sqlite> create table t ( _start datetime, _stop datetime, _time datetime, _measurement string, _field string, _value bigint );

Then run the following. Only one row shows up, as expected.

import "csv"
import "experimental"
import "system"
import "sql"

csvdata ="
#group,false,false,true,true,false,false,true,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string,string
#default,_result,,,,,,,
,result,table,_start,_stop,_time,_value,_field,_measurement
,,0,2018-04-06T10:49:41.565Z,2020-04-06T11:49:41.564Z,2020-02-22T15:01:00Z,50,bottom_degrees,h2o_temperature
" 

A = csv.from( csv: csvdata )
        |> map( fn: (r) => ({ r with _time: system.time() }) )
        |> sql.to( 
                driverName: "sqlite3",
                dataSourceName: "file:/tmp/to.db?cache=shared&mode=rw",
                table: "t" )

B = csv.from( csv: csvdata )

experimental.chain( first: A, second: B )
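
The row count can be read back from Flux as well. This is a hedged sketch assuming the same database file and table as above; the mode=ro parameter and the column alias n are illustrative choices, not part of the original reproduction.

```flux
import "sql"

// Sketch: count the rows that sql.to wrote into table t.
// Opens the database read-only so the check itself cannot write.
sql.from(
    driverName: "sqlite3",
    dataSourceName: "file:/tmp/to.db?cache=shared&mode=ro",
    query: "SELECT COUNT(*) AS n FROM t",
)
```

Running select count(*) from t; in the sqlite3 shell opened earlier gives the same information.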

influxdata/flux issue #3614