dyne / reflow-os

Base scripts to run Reflow OS

Material Passport generation unclear #28

Open sbocconi opened 2 years ago

sbocconi commented 2 years ago

(Sorry for the long log outputs)

I generated the following flow on http://reflow-demo.dyne.org:4000/api/explore using Node-RED and the script in this repository:

Mon Jan 31 16:45:55 CET 2022 - Logged user hospital in, id: "01FT69JP6XX5ZF0XKDT960P7S9", token: "QTEyOEdDTQ.cbv_PMyfCYs0fSkAHe2tx56AylUbFpxECChhjyyhg3_xV37Fpmbap4cIk28.DxFhXQSuG3wQ41zV.Jb2UyMfqLmw5-vhvT98lC5jbf-MGhcwTKD68wsfHK9THt9I4ceViYjnVvXYrEf4r.9ednSavUAy-_ZnYDY9GuWQ"
Mon Jan 31 16:45:55 CET 2022 - Logged user cleaner in, id: "01FT69JVNXZD48QANV2GHQP3ZY", token: "QTEyOEdDTQ.FNfjZV-I7YfE7Pi_fSbgzB2EO8FpZZRv93o_5fZF545PtxOTQOMCIPckpAM.5EislXnBhG_HiX46.oeskZCvqAFkJT2dbHoNzJ0JhhubbdAqK_BGs0iE8o3vPC1o0vS1uJ6iErBqF4sh7.RdVRcMUlFb0ZmksrX8kqlA"
Mon Jan 31 16:45:56 CET 2022 - Reading units from file init_reflow-demo.dyne.org.json
Mon Jan 31 16:45:56 CET 2022 - Read location for OLVG, id: "01FTR5XKWQDV7MAKM0PR3776MW"
Mon Jan 31 16:45:56 CET 2022 - Read location for CleanLease Eindhoven, id: "01FTR5XNRF2JYDKR3A2D1WABJN"
Mon Jan 31 16:45:56 CET 2022 - Read unit for gowns, id: "01FTR5XR0P5R3N3GY2ZAQDJWZ3"
Mon Jan 31 16:45:56 CET 2022 - Read unit for mass (kg), id: "01FTR5XRQYFYZ733W3S574H50Q"
Mon Jan 31 16:45:56 CET 2022 - Read unit for volume (litre), id: "01FTR5XS47KJ55YZCYH0W8YFXV"
Mon Jan 31 16:45:56 CET 2022 - Read unit for time (hour), id: "01FTR5XSKAS697BBGKEMP9TX2B"
Mon Jan 31 16:46:01 CET 2022 - Created 1 gown with tracking id: gown-14260, id: "01FTRD0V4FQ3DPNYYMW8W64B70" owned by the cleaner, event id: "01FTRD0Y78GP23493FK30222BR"
Mon Jan 31 16:46:05 CET 2022 - Created 100 kg soap with tracking id: soap-27862, id: "01FTRD10PH451TEDW88TA1WPZZ" owned by the cleaner, event id: "01FTRD12C0G0ZJH0QS636FCFR9"
Mon Jan 31 16:46:10 CET 2022 - Created 50 liters water with tracking id: water-27057, id: "01FTRD14GH1NKEGYDFERW9GDD0" owned by the cleaner, event id: "01FTRD16GBKAM9J5FPNTB6F58D"
Mon Jan 31 16:46:12 CET 2022 - Transferred custody of 1 gown to hospital with note: Transfer gowns to hospital, event id: "01FTRD18SFCXQDYQAPKA3WVH1G"
Mon Jan 31 16:46:15 CET 2022 - Created process: Process Use Gowns, process id: "01FTRD1B58FDR6D0NN91SEBQMC"
Mon Jan 31 16:46:17 CET 2022 - Created event: work perform surgery, event id: "01FTRD1DHW3PECBZ47EMWSXDA5", action work 80 hours as input for process: "01FTRD1B58FDR6D0NN91SEBQMC"
Mon Jan 31 16:46:20 CET 2022 - Created event: accept use for surgery, event id: "01FTRD1FKSBRC41VFXD15S736W", action accept 1 gown as input for process: "01FTRD1B58FDR6D0NN91SEBQMC"
Mon Jan 31 16:46:25 CET 2022 - Created event: modify dirty after use, event id: "01FTRD1JJWJZMCS60RV89WD47N", action modify 1 gown as output for process: "01FTRD1B58FDR6D0NN91SEBQMC"
Mon Jan 31 16:46:29 CET 2022 - Transferred custody of 1 gown to cleaner with note: Transfer gowns to cleaner, event id: "01FTRD1QZ5GK19H8SGJ1AEQR66"
Mon Jan 31 16:46:33 CET 2022 - Created process: Process Clean Gowns, process id: "01FTRD1VJ6JPRTHSBMK5MZF91J"
Mon Jan 31 16:46:37 CET 2022 - Created event: accept gowns to be cleaned, event id: "01FTRD1ZJYJ7EJYB9KZYGFEWZ5", action accept 1 gown as input for process: "01FTRD1VJ6JPRTHSBMK5MZF91J"
Mon Jan 31 16:46:40 CET 2022 - Created event: consume water for the washing, event id: "01FTRD23V2AE9J3FCRT4W7V61J", action consume 25 liters water as input for process: "01FTRD1VJ6JPRTHSBMK5MZF91J"
Mon Jan 31 16:46:44 CET 2022 - Created event: consume soap for the washing, event id: "01FTRD26JCS2CZWEJYPCQZW2MJ", action consume 50 kg soap as input for process: "01FTRD1VJ6JPRTHSBMK5MZF91J"
Mon Jan 31 16:46:47 CET 2022 - Created event: modify clean after washing, event id: "01FTRD2AKQA7JYVC8PZVYQSVJT", action modify 1 gown as output of process: "01FTRD1VJ6JPRTHSBMK5MZF91J"
Mon Jan 31 16:46:47 CET 2022 - Doing trace and track gown: "01FTRD0V4FQ3DPNYYMW8W64B70"

The query to generate the trace and track info is the following (with id taken from the last log line above and recurseLimit: 10):

query($id:ID!, $recurseLimit:Int!) {
                economicResource(id: $id) {
                    trace(recurseLimit: $recurseLimit) {...trace}
                    track(recurseLimit: $recurseLimit) {...track}
                }
            }
            fragment unit on Unit {
                id symbol label
            }
            fragment measure on Measure {
                hasUnit {...unit}
                hasNumericalValue
            }
            fragment spatialThing on SpatialThing {
                id
                name
                mappableAddress
                note
                geom
                lat alt long
            }
            fragment economicResource on EconomicResource {
                id
                resourceName: name
                note
                primaryAccountable {id name displayUsername}
                onhandQuantity {...measure}
                accountingQuantity {...measure}
                currentLocation {...spatialThing}
                trackingIdentifier
            }
            fragment economicEvent on EconomicEvent {
                id
                action {id}
                provider {id name displayUsername}
                receiver {id name displayUsername}
                resourceQuantity {...measure}
                resourceClassifiedAs
                resourceInventoriedAs {...economicResource}
                note
            }
            fragment process on Process {
                id
                processName: name
                note
                inputs {...economicEvent}
                outputs {...economicEvent}
            }
            fragment track on ProductionFlowItem {
                ... on EconomicResource {__typename ...economicResource}
                ... on EconomicEvent {__typename ...economicEvent}
                ... on Process {__typename ...process}
            }
            fragment trace on ProductionFlowItem {
                ... on EconomicResource {__typename ...economicResource}
                ... on EconomicEvent {__typename ...economicEvent}
                ... on Process {__typename ...process}
            }

The generated output is in this file.

When I examine the output I get:

cat MP-demo.log | grep -v 'CET 2022' | jq '.trace[] | .id + " " + .__typename + " " + .note'

"01FTRD2AKQA7JYVC8PZVYQSVJT EconomicEvent modify clean after washing"
"01FTRD1JJWJZMCS60RV89WD47N EconomicEvent modify dirty after use"
"01FTRD0V4FQ3DPNYYMW8W64B70 EconomicResource "
"01FTRD23V2AE9J3FCRT4W7V61J EconomicEvent consume water for the washing"
"01FTRD26JCS2CZWEJYPCQZW2MJ EconomicEvent consume soap for the washing"
"01FTRD1ZJYJ7EJYB9KZYGFEWZ5 EconomicEvent accept gowns to be cleaned"
"01FTRD1VJ6JPRTHSBMK5MZF91J Process Cleaning process performed at CleanLease Eindhoven"
"01FTRD1DHW3PECBZ47EMWSXDA5 EconomicEvent work perform surgery"
"01FTRD1FKSBRC41VFXD15S736W EconomicEvent accept use for surgery"
"01FTRD1B58FDR6D0NN91SEBQMC Process Use process performed at OLVG"

cat MP-demo.log | grep -v 'CET 2022' | jq '.track[] | .id + " " + .__typename + " " + .note'

"01FTRD1FKSBRC41VFXD15S736W EconomicEvent accept use for surgery"
"01FTRD1ZJYJ7EJYB9KZYGFEWZ5 EconomicEvent accept gowns to be cleaned"
"01FTRD0V4FQ3DPNYYMW8W64B70 EconomicResource "
"01FTRD1JJWJZMCS60RV89WD47N EconomicEvent modify dirty after use"
"01FTRD1B58FDR6D0NN91SEBQMC Process Use process performed at OLVG"
"01FTRD2AKQA7JYVC8PZVYQSVJT EconomicEvent modify clean after washing"
"01FTRD1VJ6JPRTHSBMK5MZF91J Process Cleaning process performed at CleanLease Eindhoven"

I have the following questions/points:

  1. There is no required timestamp for the query, so there can be no backward (trace) and forward (track) reporting: any point in time that separates past from future is arbitrary (except timestamp=now, which leaves track empty).
  2. Track and trace contain almost the same information, and track is a subset of trace (track is missing the events concerning other resources). Is this on purpose? The events missing from track are:
     2.1. "01FTRD23V2AE9J3FCRT4W7V61J EconomicEvent consume water for the washing"
     2.2. "01FTRD26JCS2CZWEJYPCQZW2MJ EconomicEvent consume soap for the washing"
     2.3. "01FTRD1DHW3PECBZ47EMWSXDA5 EconomicEvent work perform surgery"
  3. Creation events are not traced (nor tracked), i.e.
     3.1. Created 1 gown with tracking id: gown-14260, event id: "01FTRD0Y78GP23493FK30222BR"
     3.2. Created 100 kg soap with tracking id: soap-27862, event id: "01FTRD12C0G0ZJH0QS636FCFR9"
     3.3. Created 50 liters water with tracking id: water-27057, event id: "01FTRD16GBKAM9J5FPNTB6F58D"
  4. Transfer events are not traced nor tracked:
     4.1. Transferred custody of 1 gown to hospital, event id: "01FTRD18SFCXQDYQAPKA3WVH1G"
     4.2. Transferred custody of 1 gown to cleaner, event id: "01FTRD1QZ5GK19H8SGJ1AEQR66"

In general we need to define what should be in the Material Passport. At the moment the output contains events as well as processes that contain the same events.
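The subset observation in point 2 can be checked mechanically. The following is a minimal Python sketch, not part of the repository scripts, that compares the ids copied from the two jq listings above; the variable names are only illustrative.

```python
# Ids copied from the trace and track listings above (first example).
trace_ids = [
    "01FTRD2AKQA7JYVC8PZVYQSVJT",  # modify clean after washing
    "01FTRD1JJWJZMCS60RV89WD47N",  # modify dirty after use
    "01FTRD0V4FQ3DPNYYMW8W64B70",  # the gown resource
    "01FTRD23V2AE9J3FCRT4W7V61J",  # consume water for the washing
    "01FTRD26JCS2CZWEJYPCQZW2MJ",  # consume soap for the washing
    "01FTRD1ZJYJ7EJYB9KZYGFEWZ5",  # accept gowns to be cleaned
    "01FTRD1VJ6JPRTHSBMK5MZF91J",  # cleaning process
    "01FTRD1DHW3PECBZ47EMWSXDA5",  # work perform surgery
    "01FTRD1FKSBRC41VFXD15S736W",  # accept use for surgery
    "01FTRD1B58FDR6D0NN91SEBQMC",  # use process
]
track_ids = [
    "01FTRD1FKSBRC41VFXD15S736W",
    "01FTRD1ZJYJ7EJYB9KZYGFEWZ5",
    "01FTRD0V4FQ3DPNYYMW8W64B70",
    "01FTRD1JJWJZMCS60RV89WD47N",
    "01FTRD1B58FDR6D0NN91SEBQMC",
    "01FTRD2AKQA7JYVC8PZVYQSVJT",
    "01FTRD1VJ6JPRTHSBMK5MZF91J",
]

track_set = set(track_ids)
# Ids that appear in trace but not in track, in trace order.
only_in_trace = [i for i in trace_ids if i not in track_set]
# Ids that appear in track but not in trace (expected to be empty here).
only_in_track = [i for i in track_ids if i not in set(trace_ids)]

print("track is a subset of trace:", track_set <= set(trace_ids))
print("only in trace:", only_in_trace)
print("only in track:", only_in_track)
```

For this data the difference is exactly the water, soap and work events listed under point 2.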

denizenging commented 2 years ago

Hi, @sbocconi.

Thanks for the quick Jitsi meeting! <(^.^<)

There is no required timestamp for the query, so there can be no backward (trace) and forward (track) reporting, as any point in time determining what is past and what is future is arbitrary (except to have timestamp=now, and track empty)

From what we discussed on Jitsi, you seem to be looking for timestamp information on the records. The IDs used in the records conform to the ULID standard, which means the first 10 characters already encode a timestamp in Crockford's Base32. You can use them for what you need, but as we discussed, I'll implement timestamp fields on the Process, EconomicEvent, and EconomicResource records for you. I'll notify you when they are implemented.
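Until those fields exist, the timestamp can be read directly from the id. Below is a minimal Python sketch of decoding it, assuming the ids are standard 26-character ULIDs whose first 10 characters hold a 48-bit millisecond Unix timestamp; the function names `ulid_timestamp_ms` and `ulid_datetime` are made up for this illustration and are not part of the Reflow API.

```python
from datetime import datetime, timedelta, timezone

# Crockford's Base32 alphabet (the letters I, L, O and U are not used).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid: str) -> int:
    """Return the millisecond Unix timestamp encoded in the first 10 characters."""
    if len(ulid) != 26:
        raise ValueError("a ULID has exactly 26 characters")
    ms = 0
    for char in ulid[:10].upper():
        # Each character contributes 5 bits; index() raises ValueError on bad input.
        ms = ms * 32 + CROCKFORD.index(char)
    return ms


def ulid_datetime(ulid: str) -> datetime:
    """Return the encoded timestamp as a timezone-aware UTC datetime."""
    ms = ulid_timestamp_ms(ulid)
    whole_seconds = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return whole_seconds + timedelta(milliseconds=ms % 1000)


# The gown resource id from the first log in this issue.
print(ulid_datetime("01FTRD0V4FQ3DPNYYMW8W64B70").isoformat())
```

For the gown id this gives a time on 31 January 2022 at 15:45:56 UTC (16:45:56 CET), a few seconds before the matching "Created 1 gown" log line.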

Track and trace almost contain the same information, and track is a subset of trace (track missing the events concerning other resources , is this on purpose?

Yes, this is on purpose. I explained the reason in our last Home Of The Tech meeting, but for reference: in our implementation, track and trace of a record return all the associated records (EconomicResource, EconomicEvent, Process), which effectively makes them look like they both contain the same information. Have a look at https://www.valueflo.ws/algorithms/track/#track-and-trace-logic, and you'll see that the returned records conform to that resolving algorithm.

Creation events are not traced (nor tracked), i.e.

From what we discussed on Jitsi, you told me they were not wrapped in a Process. That might be the issue here. The algorithm requires non-transfer/move events to be wrapped inside a Process.

Transfer events are not traced nor tracked

This looks strange. I'll have a look.

I hope my answers were helpful. Please don't hesitate to ask more questions if they are not clear.

Cheers!

fosterlynn commented 2 years ago

Track and trace almost contain the same information, and track is a subset of trace (track missing the events concerning other resources , is this on purpose?

Yes, this is on purpose. I've explained the reason in our last Home Of The Tech meeting, but for the reference: This is because in our implementation, track and trace of a record returns all the associated records (EconomicResource, EconomicEvent, Process), which effectively makes them look like they both contain the same information. Have a look at https://www.valueflo.ws/algorithms/track/#track-and-trace-logic, and you'll see that the returned records conform to that resolving algorithm.

@srfsh @sbocconi it doesn't sound right that you are seeing some number of the same records on both the track and the trace, unless you have a different starting point. @bhaugen has some working code he can point you to, which might help, since it includes more than the doc (specifically, the recursion logic). And we'll see what we can do to further document this feature. And I'm happy to take a look at your data and results with you to see if I can help spot anything.

Another note, just to make sure: This is the feature I was talking about in the graphql telegram chat, where I made some breaking changes to the naming: next and previous now should bring back one level forwards or back; track and trace should bring back the whole chain either forward or backward.

More coming later...

bhaugen commented 2 years ago

There is no required timestamp for the query, so there can be no backward (trace) and forward (track) reporting, as any point in time determining what is past and what is future is arbitrary (except to have timestamp=now, and track empty)

Tracing backwards and tracking forwards from a resource means tracing the incoming resource flows and tracking the outgoing resource flows from the resource you are starting from.

I tried to explain the difference between forwards and backwards in time vs forwards and backwards in resource flows here: https://lab.allmende.io/valueflows/vf-app-specs/vf-apps-traversing-the-flows/-/issues/6#note_39496

Those seem to be difficult concepts to explain and understand.

Another way to think about it is in terms of the purposes of tracking and tracing. Some examples:

A few years ago, "mad cow disease" affected several people in the UK who ate beef from cattle who had the disease. So the affected beef needed to be traced back to the source meat packing plants, slaughterhouses, feeding operations, and farms, to see where the disease originated. Turned out it came from feeding parts of diseased sheep to cattle. I did not make that up: https://www.cdc.gov/prions/bse/about.html#:~:text=BSE%20possibly%20originated%20as%20a,or%20scrapie%2Dinfected%20sheep%20products.

Another problem was coliform bacteria in lettuce, which was traced back to the farms it came from, and then tracked from there to everywhere any vegetables from that farm went so they could be recalled.

In 1982, somebody as yet unknown poisoned bottles of Tylenol in Chicago. The public health authority started trying to trace where they came from, and then track where all of the possibly poisoned bottles from that source might have gone to. The manufacturer figured their reputation was shot anyway, and tracking and tracing was too slow and uncertain, so they recalled all bottles of Tylenol everywhere in supply chains as well as people's houses.

Please let me know if that made sense, and helps your understanding.

bhaugen commented 2 years ago

Track and trace almost contain the same information, and track is a subset of trace (track missing the events concerning other resources , is this on purpose?

Yes, this is on purpose. I've explained the reason in our last Home Of The Tech meeting, but for the reference: This is because in our implementation, track and trace of a record returns all the associated records (EconomicResource, EconomicEvent, Process), which effectively makes them look like they both contain the same information. Have a look at https://www.valueflo.ws/algorithms/track/#track-and-trace-logic, and you'll see that the returned records conform to that resolving algorithm.

@srfsh @sbocconi it doesn't sound right that you are seeing some number of the same records on both the track and the trace, unless you have a different starting point.

I'd like to see the data you are working from and maybe the code you are using. I could probably figure it out from the outputs you posted but the data plus where you were starting from with tracking and tracing would speed the conversation.

bhaugen commented 2 years ago

@sbocconi I'm also interested in any details, code, screenshots, etc, about how you are using node-red. You can post on gitter if it seems off-topic here.

sbocconi commented 2 years ago

Hello @bhaugen, I have posted on Gitter the repository I used to generate the data in this issue, and I have added a Readme that hopefully helps to understand how to use the code.

Meanwhile the conversation about the Material Passport is still ongoing within Reflow.

fosterlynn commented 2 years ago

it doesn't sound right that you are seeing some number of the same records on both the track and the trace, unless you have a different starting point.

I might understand better how you would see this now. If your flow is linear, without branching, and you are executing trace from the end, and track from the beginning, you would get pretty much what you have described. If this is the case, please disregard my earlier comment and pardon the interruption.

But note, the order should be different, and proceed in the forward or backward direction as you would expect.

Also, I'll try for a picture to help describe expected results for track and trace, and we'll use that plus some of @bhaugen's recent explanations to improve the doc.

sbocconi commented 2 years ago

I have generated another case where the track and trace should be clear and not ambiguous IMHO. The process is now to create cotton (action raise), and consume cotton to produce a gown in a process. The gown then undergoes transport, usage and cleaning as before.

This is the list of what happens on reflow-demo (the complete log is here):

Thu Feb 10 14:56:26 CET 2022 - Logged user hospital in, id: "01FT69JP6XX5ZF0XKDT960P7S9", token: "QTEyOEdDTQ.T9HlxlmXHS5Y9t_6fWr2U6iE_BDWFriZGMBP_eQHOK5de4R5uKVwnpPxk1g.U-WYVudGdY65B_cM.LOkFcDyBjt_C6PoV_EDbNMXFh029rSzMLPcZeuoiOZllnI6m19txL-rIqCyoih1D.KlssFNihkApjPjyObhmGUQ"
Thu Feb 10 14:56:27 CET 2022 - Logged user cleaner in, id: "01FT69JVNXZD48QANV2GHQP3ZY", token: "QTEyOEdDTQ.bz7HT7mLn5Z7HRPldfphX_bafRcde_WwEuNBIa8Y2B4sgJqHh3cSTdJtGCM.XQ2TGCG6Amiubk-K.7tSEa9FrM78Pf4l5u8nUvwYW32nWEoqMqFrAUi3z_ETCeuZVy1mcLoyTKo7E2wIU.spGnqIDGLaUWXXDPqmgjpg"
Thu Feb 10 14:56:27 CET 2022 - Reading units from file init_reflow-demo.dyne.org.json
Thu Feb 10 14:56:27 CET 2022 - Read location for OLVG, id: "01FTR5XKWQDV7MAKM0PR3776MW"
Thu Feb 10 14:56:27 CET 2022 - Read location for CleanLease Eindhoven, id: "01FTR5XNRF2JYDKR3A2D1WABJN"
Thu Feb 10 14:56:27 CET 2022 - Read unit for gowns, id: "01FTR5XR0P5R3N3GY2ZAQDJWZ3"
Thu Feb 10 14:56:27 CET 2022 - Read unit for mass (kg), id: "01FTR5XRQYFYZ733W3S574H50Q"
Thu Feb 10 14:56:27 CET 2022 - Read unit for volume (litre), id: "01FTR5XS47KJ55YZCYH0W8YFXV"
Thu Feb 10 14:56:27 CET 2022 - Read unit for time (hour), id: "01FTR5XSKAS697BBGKEMP9TX2B"
Thu Feb 10 14:56:30 CET 2022 - Created process: Process create soap, process id: "01FVHYQJ8S56TRBDF46TQ2KFV9"
Thu Feb 10 14:56:35 CET 2022 - Created 100 kg soap with tracking id: soap-11856, id: "01FVHYQN92MPNPPZTKAXVFM5R9" owned by the cleaner, event id: "01FVHYQPZT7DXC81YJFQ8ECN8S"
Thu Feb 10 14:56:37 CET 2022 - Created process: Process create water, process id: "01FVHYQSXA5CCJ0NGVAYQ2FVHK"
Thu Feb 10 14:56:42 CET 2022 - Created 50 liters water with tracking id: water-3763, id: "01FVHYQWCXS4HGPMYA035YK3HS" owned by the cleaner, event id: "01FVHYQXRESH0KY5RX37SGJJVD"
Thu Feb 10 14:56:46 CET 2022 - Created process: Process create cotton, process id: "01FVHYR0QAXF6A7VYCSVYTCEX3"
Thu Feb 10 14:56:51 CET 2022 - Created 10 kg cotton with tracking id: cotton-20851, id: "01FVHYR4P5XPABF0G5CVCG74ZA" owned by the cleaner, event id: "01FVHYR6NQ1JH6KNK41TZSQ3B5"
Thu Feb 10 14:56:54 CET 2022 - Created process: Process sew gown, process id: "01FVHYR0QAXF6A7VYCSVYTCEX3"
Thu Feb 10 14:56:58 CET 2022 - Created event: consume cotton for sewing, event id: "01FVHYRD08M9C6EWVXCR3QSBY6", action consume 10 kg cotton as input for process: "01FVHYR9HFZ2CXHAWENHWRGDPK"
Thu Feb 10 14:57:04 CET 2022 - Created event: produce gown, event id: "01FVHYRK5F4ZD5CTE1GFSR4J07", action produce 1 gown with tracking id: gown-1343, id: "01FVHYRGRXWZCM4TQPFS2N1VVW" owned by the cleaner as output of process: "01FVHYR9HFZ2CXHAWENHWRGDPK"
Thu Feb 10 14:57:08 CET 2022 - Transferred custody of 1 gown to hospital with note: Transfer gowns to hospital, event id: "01FVHYRPM2R5MEF7WC0EJR21V4"
Thu Feb 10 14:57:12 CET 2022 - Created process: Process Use Gown, process id: "01FVHYRTBMH25HZAW4TXWF3PMX"
Thu Feb 10 14:57:16 CET 2022 - Created event: work perform surgery, event id: "01FVHYRYBVECRKSBWA2MQQ7GYR", action work 80 hours as input for process: "01FVHYRTBMH25HZAW4TXWF3PMX"
Thu Feb 10 14:57:20 CET 2022 - Created event: accept use for surgery, event id: "01FVHYS20A758132DNQVDSBV6P", action accept 1 gown as input for process: "01FVHYRTBMH25HZAW4TXWF3PMX"
Thu Feb 10 14:57:23 CET 2022 - Created event: modify dirty after use, event id: "01FVHYS6DWNWCKP30GSDGMMJ0H", action modify 1 gown as output for process: "01FVHYRTBMH25HZAW4TXWF3PMX"
Thu Feb 10 14:57:27 CET 2022 - Transferred custody of 1 gown to cleaner with note: Transfer gowns to cleaner, event id: "01FVHYS91ZJ2C1ZV9N5D492EFS"
Thu Feb 10 14:57:30 CET 2022 - Created process: Process Clean Gown, process id: "01FVHYSCGSYB86NQBD9BCMS392"
Thu Feb 10 14:57:32 CET 2022 - Created event: accept gowns to be cleaned, event id: "01FVHYSG11TDJ0V1FAQ27MPE5B", action accept 1 gown as input for process: "01FVHYSCGSYB86NQBD9BCMS392"
Thu Feb 10 14:57:36 CET 2022 - Created event: consume water for the washing, event id: "01FVHYSJKGN7AJKM8ZXW6MWNKG", action consume 25 liters water as input for process: "01FVHYSCGSYB86NQBD9BCMS392"
Thu Feb 10 14:57:39 CET 2022 - Created event: consume soap for the washing, event id: "01FVHYSNJ2PC9KMF8EAXFXX1RB", action consume 50 kg soap as input for process: "01FVHYSCGSYB86NQBD9BCMS392"
Thu Feb 10 14:57:42 CET 2022 - Created event: modify clean after washing, event id: "01FVHYSS4QYTJXPK8C3GHZY9NP", action modify 1 gown as output of process: "01FVHYSCGSYB86NQBD9BCMS392"
Thu Feb 10 14:57:42 CET 2022 - Doing trace and track gown: "01FVHYRGRXWZCM4TQPFS2N1VVW"

When I inspect the result of the track and trace, I get this:

Result from trace

"01FVHYRD08M9C6EWVXCR3QSBY6 EconomicEvent consume cotton for sewing"
"01FVHYR9HFZ2CXHAWENHWRGDPK Process Sew gown process performed by CleanLease Eindhoven"
"01FVHYRK5F4ZD5CTE1GFSR4J07 EconomicEvent produce gown"
"01FVHYS6DWNWCKP30GSDGMMJ0H EconomicEvent modify dirty after use"
"01FVHYSS4QYTJXPK8C3GHZY9NP EconomicEvent modify clean after washing"
"01FVHYRGRXWZCM4TQPFS2N1VVW EconomicResource "
"01FVHYS20A758132DNQVDSBV6P EconomicEvent accept use for surgery"
"01FVHYRYBVECRKSBWA2MQQ7GYR EconomicEvent work perform surgery"
"01FVHYRTBMH25HZAW4TXWF3PMX Process Use gown process performed at OLVG"
"01FVHYSG11TDJ0V1FAQ27MPE5B EconomicEvent accept gowns to be cleaned"
"01FVHYSNJ2PC9KMF8EAXFXX1RB EconomicEvent consume soap for the washing"
"01FVHYSJKGN7AJKM8ZXW6MWNKG EconomicEvent consume water for the washing"
"01FVHYSCGSYB86NQBD9BCMS392 Process Clean gown process performed at CleanLease Eindhoven"

Result from track

"01FVHYSG11TDJ0V1FAQ27MPE5B EconomicEvent accept gowns to be cleaned"
"01FVHYS20A758132DNQVDSBV6P EconomicEvent accept use for surgery"
"01FVHYRGRXWZCM4TQPFS2N1VVW EconomicResource "
"01FVHYSS4QYTJXPK8C3GHZY9NP EconomicEvent modify clean after washing"
"01FVHYSCGSYB86NQBD9BCMS392 Process Clean gown process performed at CleanLease Eindhoven"
"01FVHYS6DWNWCKP30GSDGMMJ0H EconomicEvent modify dirty after use"
"01FVHYRTBMH25HZAW4TXWF3PMX Process Use gown process performed at OLVG"

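Trace results seem wrong to me, because:

  1. I should not see anything that happens to the gown, only how the gown came to be
  2. Also the raise event of the cotton should be there, because this is how the gown came to be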

bhaugen commented 2 years ago

@sbocconi thanks for all the details. This is a placeholder response, so you know I am looking and will try to respond more usefully after I study everything.

sbocconi commented 2 years ago

Using the example described above, I see (at least) two possibilities for the structure returned for trace (the ids are real so you can inspect the events etc. on reflow-demo):

//  TRACE
// Option 1: the nesting follows the chain backwards, from the last event that is outputOf a Process -> Process -> events that are inputOf that Process -> event that is outputOf a Process -> Process etc

{
  "id": "01FVHYRGRXWZCM4TQPFS2N1VVW",
  "economicEvents": [
    {
      "id": "01FVHYRK5F4ZD5CTE1GFSR4J07",
      "note": "produce 1 gown with tracking id: gown-1343",
      "process": {
        "id": "01FVHYR9HFZ2CXHAWENHWRGDPK",
        "note": "Process sew gown",
        "economicEvents": [
          {
            "id": "01FVHYRD08M9C6EWVXCR3QSBY6",
            "note": "consume cotton for sewing",
            "economicEvents": [
                {
                    "id": "01FVHYR6NQ1JH6KNK41TZSQ3B5",
                    "note": "created 10 kg cotton with tracking id: cotton-20851",
                    "process": {
                        "id": "01FVHYR0QAXF6A7VYCSVYTCEX3",
                        "note": "Process create cotton",
                        "economicEvents": []
                    }
                }
            ]
          }
        ]
      }
    }
  ]
}
// Option 2: the nesting follows the chain backwards, encapsulating when possible the events in a process using In and Out events

{
    "id": "01FVHYRGRXWZCM4TQPFS2N1VVW",
    "Processes": [
      {
        "id": "01FVHYRK5F4ZD5CTE1GFSR4J07",
        "note": "produce gown",
        "economicEventsIn": [
            {
                "id": "01FVHYRD08M9C6EWVXCR3QSBY6",
                "note": "consume cotton for sewing",
                "process": {
                    "id": "01FVHYR0QAXF6A7VYCSVYTCEX3",
                    "note": "Process create cotton",
                    "economicEventsIn": [],
                    "economicEventsOut": [
                        {"id": "01FVHYR6NQ1JH6KNK41TZSQ3B5",
                            "note": "created 10 kg cotton with tracking id: cotton-20851"
                        }
                    ]
                }
            }

        ],
        "economicEventsOut": [
            {
                "id": "01FVHYRK5F4ZD5CTE1GFSR4J07",
                "note": "produce 1 gown with tracking id: gown-1343"
            }
        ]
    }
    ]
}

bhaugen commented 2 years ago

@sbocconi thanks again. I am wading thru all of those details, might have many picky questions. For example, in your MP-demo.log, I see event id: "01FVHYRK5F4ZD5CTE1GFSR4J07", action produce 1 gown with tracking id: gown-1343 ...but then I don't see that tracking id anymore.

Trace results seem wrong to me, because:

I should not see anything that happens to the gown, only how the gown came to be

Correct.

Also the raise event of the cotton should be there, because this is how the gown came to be

True, but I didn't see the raise event in your logs or options. Did you add a Process to create the cotton instead of a raise event? On line 14 of your MP-demo.log: https://github.com/reflow-project/Node-red/blob/main/MP-demo.log#L14

Overall, your option 2 seems more clear to me than option 1.

Is there a spec for what should go on a material passport? Or are you creating the spec through your current work as explained here?

Probably more comments to come...

sbocconi commented 2 years ago

Hi @bhaugen, thanks for looking at the issue!

Indeed I do not use the tracking id much in my script (also not for the water, soap and cotton resources), as I refer to the resource with the Reflow generated id.

This is just for practical reasons, as I saw that id used more than the other one in the GraphQL API (this might be because I quickly checked how to make the calls without delving too much into the documentation).

This might change: as I understand it, when the resource flows to a different "system" the id might change according to the convention of the new system, but for the time being this is not happening in the example.

The fact that you do not see a raise event in the log is due to the fact that I have wrapped the raise event in a syntactic sugar "create" call (for no particular reason), and therefore the logging is different. In fact each time you see a Created in the log that is a raise event under the hood. Each non-transfer related event is wrapped in a process according to the guidelines from Dyne.

And finally, we are trying to converge on what we would like to see in a Material Passport, so yes this is creating the spec with this discussion. That is very urgent as we have pilots that want to use the Material Passport and we need to feed a stable structure to the cryptography to have the Material Passport signed.

bhaugen commented 2 years ago

@sbocconi thanks a lot! @fosterlynn and I also just met with @srfsh and we'll meet with him again soon. This exercise has been very valuable for testing out the Valueflows documentation and finding out where it is unclear, and in some cases your use case is exposing things we need to improve in the spec. So thanks a lot to both of you, and we'll work with you to get things all cleared up asap.

Took me awhile to understand the actual flows in this example, but is this accurate?

Starting from the Resource Gown-1343 as received by the hospital, would the trace go:

[Gown-1343@Transfer]<-(Transfer)<-[Gown-1343@sew_gown]<-(Process:sew gown)<-etc

...where the trace goes backwards from the starting point, and Resources are identified by their id@the last Process or Transfer that they came out of.

(In many situations, the trackingIdentifier would be used because the Resource.id will change, but in this case, if we understand correctly, your Resource keeps the same id throughout its life cycle, regardless of Agents and locations. Correct?)

Some of the complexity here is what Lynn very recently added to https://www.valueflo.ws/algorithms/track/ "Additional special logic": this example has the same Resource traveling through many Processes. (And we need to add Transfers to that logic as well as Processes.)

There are a few other additions and corrections that we will make to that logic description after we think we all understand this case.

sbocconi commented 2 years ago

Hello @bhaugen ,

I think there is some uncertainty (at least for me) about which point in time divides the trace record from the track record.

If we define the trace as to how something came to be, the transfer event is not part of it, as at that moment the gown already exists.

So strictly speaking the trace for me would be (if I get your formalism right):

[Gown-1343@sew_gown]<-[Process:sew gown]<-[Cotton-20851@consume_cotton]<-[Cotton-20851@raise]<-[Process:create cotton]

Of course, the transfer does seem relevant to the information we want to provide, but then I think it is necessary to provide a timestamp, thereby defining everything that happened before it as the domain of trace, and everything after it as the domain of track.

This is particularly important in the case of circular loops in flows, as a resource goes through the same process many times, so what is the past and what is the future without a timestamp?

(In many situations, the trackingIdentifier would be used because the Resource.id will change, but in this case, if we understand correctly, your Resource keeps the same id throughout its life cycle, regardless of Agents and locations. Correct?)

Yes this is correct.

bhaugen commented 2 years ago

@sbocconi If you envision the flows for a Resource as a directed graph, with nodes for each of the related EconomicEvents and Processes, then usually people start a trace at a particular node or edge. So trace backwards means preceding edges and nodes, and track forwards means subsequent nodes and edges.

You could find those via a traversal from your starting node or edge.

For example, if you start with a medical gown that just got transferred to a hospital, what happened before the transfer? and what happened before that? etc.

The EconomicEvents do have properties dealing with timing, and those might be interesting, but the main relationships are those of the nodes and edges of the directed flow graph.
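To make the graph picture concrete, here is a minimal Python sketch of trace and track as backward and forward traversals. It is only an illustration under stated assumptions: the edge list is a hand-written simplification of the cotton/gown example in this issue, the node labels are invented, and `traverse` is a plain breadth-first walk, not the Valueflows reference algorithm.

```python
from collections import defaultdict, deque

# Hypothetical flow graph: each pair is (earlier node, later node) in resource-flow order.
EDGES = [
    ("process: create cotton", "event: raise cotton"),
    ("event: raise cotton", "event: consume cotton"),
    ("event: consume cotton", "process: sew gown"),
    ("process: sew gown", "event: produce gown"),
    ("event: produce gown", "event: transfer to hospital"),
    ("event: transfer to hospital", "event: accept for surgery"),
    ("event: work perform surgery", "process: use gown"),
    ("event: accept for surgery", "process: use gown"),
    ("process: use gown", "event: modify dirty"),
    ("event: modify dirty", "event: transfer to cleaner"),
    ("event: transfer to cleaner", "event: accept for cleaning"),
    ("event: consume water", "process: clean gown"),
    ("event: consume soap", "process: clean gown"),
    ("event: accept for cleaning", "process: clean gown"),
    ("process: clean gown", "event: modify clean"),
]

successors = defaultdict(list)
predecessors = defaultdict(list)
for earlier, later in EDGES:
    successors[earlier].append(later)
    predecessors[later].append(earlier)


def traverse(start, neighbours):
    """Breadth-first walk from start along neighbours, excluding start itself."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


start = "event: transfer to hospital"
trace = traverse(start, predecessors)  # what led up to the starting node
track = traverse(start, successors)    # what followed from the starting node
print("trace:", trace)
print("track:", track)
```

In this simplified graph the two results do not overlap, and the forward walk never reaches the water, soap and work inputs, because those only flow into the processes rather than out of the gown's path.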

Does that make sense?

sbocconi commented 2 years ago

It does make sense in theory, but when I see a gown I can scan a QR code that is stitched on it to get the tracking id.

All that I know is the tracking id and the current time. I do not know about nodes that the gown might have gone through.

At that moment if the system does not allow me to specify any time in querying track&trace information, the only assumption the system can make is that I want to know everything up until then, and everything is necessarily in the past, therefore the trace contains that information, while the tracking is empty.

The fact that internal ids somehow reflect the structure of the nodes is something that I need to first know, and then perform at least one call to know what those ids are, and only then I can have a track request when I know where the gown has been in the past, but why would I if I already know all the history?

Maybe I miss something here :-)

bhaugen commented 2 years ago

Thanks, @sbocconi - I think I understand your use case better.

Two questions:

  1. What do you want on the material passport? The details about how the gown was made, or everything that ever happened to it after it was made?
  2. Do you know what info will be on the QR code? If so, what info will it be?
sbocconi commented 2 years ago

Hi @bhaugen

  1. I think the material passport (just talking about the content, not the cryptographic part) should contain everything that is known, from production (how the gown was made) to usage (everything that ever happened to it after it was made), and therefore I was trying to understand how trace and track would work to apply to this case, since I do not want to find the same elements twice if they are reported in both. And ideally I would query the system with the ID from the QR-code, which is the one known to me.
  2. the QR code is just an ID that serves to enter the gown in the system and track it.
fosterlynn commented 2 years ago

@sbocconi thanks

I think the material passport (just talking about the content, not the cryptographic part) should contain everything that is known, from production (how the gown was made) to usage (everything that ever happened to it after it was made), and therefore I was trying to understand how trace and track would work to apply to this case, since I do not want to find the same elements twice if they are reported in both

I was thinking that you would use just the trace (backwards) for the material passport. So there wouldn't be overlap with track there. Does that seem right for your requirements? You could trace from "right now", which would get everything. But that would be a different data set next week, after it was washed again, say. Or if you need a consistent passport over time, you could pick a point from which to always trace. It doesn't matter from our side, you should be able to trace from any point in the flow - or at least from the resource itself (which implies "right now"), or a specific output event, which would imply the point when something happening to the gown was completed.
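The difference between tracing from "right now" and tracing from a fixed point can be illustrated with a small sketch. It assumes the gown's history is already available in causal order; the `HISTORY` list and the `passport` helper are hypothetical names, not part of any existing API.

```python
# Hypothetical, already causally ordered history of one gown.
HISTORY = ["produce_gown", "transfer_to_hospital", "wear_1", "wash_1"]


def passport(history, from_event=None):
    """Trace back from from_event, or from "right now" when it is None."""
    if from_event is None:
        return list(history)
    # Keep everything up to and including the chosen starting point.
    return history[: history.index(from_event) + 1]


# Next week the gown has been worn again.
next_week = HISTORY + ["wear_2"]

passport_now = passport(HISTORY)
passport_next_week = passport(next_week)
stable_passport = passport(next_week, from_event="produce_gown")
```

The "right now" passport grows with every new event, while the passport traced from a fixed output event stays the same.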

sbocconi commented 2 years ago

Hello @fosterlynn,

I agree that the content for the Material Passport would be a trace from "right now". And because of the way it would work (basically you scan a tag and you are led to a webpage) it would always report all that has happened, so it would not be consistent if something has happened between two id scans.

The option to specify a particular point in time is interesting and possibly a nice-to-have I think, but at the moment that logic seems to be implicit in the use of particular ids, which are not directly exposed to the outer world.

What I mean is that the GraphQL call does not have a timestamp parameter and neither the event ids nor the internal ReflowOS ids that are generated when for example using toResourceInventoriedAs are visible from the id physically present on the item, which is a tracking id.

So in that case maybe a timestamp could help, or there should be a user interface that first fetches all the events and then asks the user from what point to trace back, but this might be redundant as trace from now has already all the information.

bhaugen commented 2 years ago

@sbocconi the medical gown use case exposed a bug in the track and trace logic described at https://www.valueflo.ws/algorithms/track/#track-and-trace-logic

The problem with using timestamps for sequencing resource flows is that we are working in distributed systems, and it can be difficult or impossible to sequence using timestamps in distributed systems. The standard way to do it is called causal ordering, which is the way the VF track and trace logic is described.

Here are some references where you can learn a lot more, or you can just believe temporarily that the problem is real.

We are working on a solution (to accomplish causal ordering without using timestamps), which will unfortunately require some changes to Valueflows EconomicEvent processing as well as to the logic for tracking and tracing if the same resource goes repetitively through a process with the same ProcessSpecification: in the case of the gowns, being cleaned many times during their life cycles.

(@fosterlynn says if we need to cheat somewhere and use timestamps as a tie-breaker, we'll do it, but will try to keep it as limited as possible.)
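A small illustration of why wall-clock timestamps can mis-sequence events across machines. This is a sketch under stated assumptions: the `Event` class, the `caused_by` field and the 5-second clock skew are all hypothetical and only serve to show the difference between timestamp ordering and causal ordering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    name: str
    wall_clock: float  # seconds, as reported by the recording machine
    caused_by: Optional[str] = None  # event that must come first


events = [
    # Recorded on the cleaner's server, whose clock runs 5 s fast.
    Event("transfer_to_hospital", wall_clock=105.0),
    # Recorded later on the hospital's server, whose clock is accurate.
    Event("wear_gown", wall_clock=102.0, caused_by="transfer_to_hospital"),
]

by_timestamp = [e.name for e in sorted(events, key=lambda e: e.wall_clock)]


def causal_order(events):
    """Place each event only after the event that caused it."""
    ordered, placed, pending = [], set(), list(events)
    while pending:
        progressed = False
        for event in list(pending):
            if event.caused_by is None or event.caused_by in placed:
                ordered.append(event.name)
                placed.add(event.name)
                pending.remove(event)
                progressed = True
        if not progressed:
            raise ValueError("cycle or missing cause")
    return ordered


by_cause = causal_order(events)
```

Sorting by timestamp puts the wearing before the transfer that made it possible; following the cause links gives the right sequence regardless of clock skew.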

CC: @srfsh

fosterlynn commented 2 years ago

What I mean is that the GraphQL call does not have a timestamp parameter

I think you shouldn't consider yourself constrained by the current parameters you find on the graphql. It is not my place to say this probably, but I think with @srfsh we can make sure you have what you need.

bhaugen commented 2 years ago

@sbocconi I am adding a new issue to the Valueflows repository based on the medical gown use case. Here's the start of the issue: https://lab.allmende.io/valueflows/vf-app-specs/vf-apps-traversing-the-flows/-/issues/7

Next, I want to describe the medical gown use case. Somewhere I think I remember an overall description, maybe written by you? I'd like to quote from, and link to, that description.

Is this the best description? https://github.com/dyne/reflow-os/issues/3

This issue references the same use case: https://github.com/dyne/zenpub/issues/58

Anything better with just the overview of the gown resource flows? Otherwise I'll use those as the basis for a description of the use case.

sbocconi commented 2 years ago

Hi @bhaugen , your description in the comment of the issue here is correct (with the minor and uninfluential correction that the gowns are not made yet with recycled cotton afaIk).

Thanks for the pointers to causal ordering, I understand the general problem, but I am not clear why ValueFlows has a dependency on the ProcessSpecification for causal ordering instead of, for example, event id or process id.

Events and processes encapsulating them in the gown case are created every time the gown goes through the loop and their ids are based on ULID (which in this case should also use a form of Lamport timestamp for distributed systems).

Or just following the chain from outputOf to inputOf?

bhaugen commented 2 years ago

the minor and uninfluential correction that the gowns are not made yet with recycled cotton afaIk

Thanks, but yeah, doesn't involve the repetitive processes, which are the issue that exposed the bug.

I am not clear why ValueFlows has a dependency on the ProcessSpecification for causal ordering instead of, for example, event id or process id.

I can understand why that is not clear enough. I'll try to clarify that in my description, but maybe try to clarify for you first to see what explanation would make sense.

As you may know, Valueflows has 3 levels, Knowledge (in this case Recipe, the standards for what should happen), Plan (the schedule of what should happen now), and Observation (what did happen). https://www.valueflo.ws/introduction/core/#levels-of-the-ontology

The ProcessSpecification is part of a Recipe, which is a network of Processes that should be followed generally in a community. Like a recipe for cooking, but can cover any activities, not just cooking.

Recipes sometimes have the same ResourceSpecification as both an input and an output of the same ProcessSpecification. In this use case, a Recipe for the gowns would include Wearing and Cleaning processes, where the same gown would go into the Wearing process clean, and come out dirty, and then go into the Cleaning process dirty, and come out clean again.

So on the Recipe level, the Gown ResourceSpecification would appear 4 times in that sequence, and to differentiate for causally ordering the Recipe, you would need to know what stage of the process flow the gown was at: clean or dirty, which you would know from the last ProcessSpecification it came out of: Wearing or Cleaning. And the Wearing process would require a gown that had been cleaned, and likewise the Cleaning process would want a gown that had been worn.

Those same methods of causal ordering using ProcessSpecifications also work at the Plan level. (I think, although I have not tried to detail that yet: Material Passports need tracking and tracing logic, which is on the Observation level. I'll go through Plans in similar detail when we get the tracking and tracing solved.)

But ProcessSpecifications do not work at the Observation level (what happened), because the same gown resource will go through many iterations of Wearing and Cleaning, where each of those actual Processes would have the same ProcessSpecification. So in other words, the ProcessSpecification works for causal ordering in the Recipe and Plan, but not for real life, where the same sequence happens many times.
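The ambiguity can be shown with a tiny sketch. The data structures below (`RECIPE_NEXT`, `OBSERVED`, `processes_with_spec`) are hypothetical and only illustrate the point; they are not the Valueflows data model.

```python
# Recipe level: each ProcessSpecification appears once, so the spec a gown
# last came out of is enough to know which step comes next.
RECIPE_NEXT = {"Wearing": "Cleaning", "Cleaning": "Wearing"}

# Observation level (hypothetical records): the same gown has gone through
# several actual Processes that share a ProcessSpecification.
OBSERVED = [
    {"id": "p1", "spec": "Wearing"},
    {"id": "p2", "spec": "Cleaning"},
    {"id": "p3", "spec": "Wearing"},
    {"id": "p4", "spec": "Cleaning"},
]


def processes_with_spec(observed, spec):
    """All observed Processes that were based on the given specification."""
    return [p["id"] for p in observed if p["spec"] == spec]


next_in_recipe = RECIPE_NEXT["Wearing"]
cleaning_candidates = processes_with_spec(OBSERVED, "Cleaning")
```

In the Recipe, "what follows Wearing" has one answer. In the observed data the same question returns two Cleaning processes, so the specification alone cannot say which one followed a given Wearing.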

Does that start to make sense?

bhaugen commented 2 years ago

@sbocconi

Events and processes encapsulating them in the gown case are created every time the gown goes through the loop and their ids are based on ULID (which in this case should also use a form of Lamport timestamp for distributed systems).

Thanks for that explanation. I did not know that, and it makes a difference. I'll study and think about it and maybe change my story and recommendations accordingly.

bhaugen commented 2 years ago

@sbocconi

Events and processes encapsulating them in the gown case are created every time the gown goes through the loop and their ids are based on ULID (which in this case should also use a form of Lamport timestamp for distributed systems).

I'm reading https://towardsdatascience.com/understanding-lamport-timestamps-with-pythons-multiprocessing-library-12a6427881c6

Solutions to this problem consist of using a central time server (Cristian’s Algorithm) or a mechanism called a logical clock. The problem with a central time server is that its error depends on the round-trip time of the message from process to time server and back. Logical clocks are based on capturing chronological and causal relationships of processes and ordering events based on these relationships.

Which means we are back to the events and processes in the gown resource flows for causal ordering, right?

Are you using ULIDs based on Lamport timestamps in your current implementation? If so, what processes and events are you using to implement a logical clock?

(Did I understand ULIDs and Lamport timestamps correctly? Or if not, please help me understand better.)

sbocconi commented 2 years ago

Hi @bhaugen ,

Regarding your first explanation, I understand what you say at the Recipe and Plan level, but indeed I would not base the ordering of the Observational level (which in my understanding is about instances) on the ordering on the other levels (which in my understanding is about classes). After all plans and recipes go wrong in reality.

In my understanding I would base the ordering on the event and process sequence, which again in my understanding can be reconstructed from inputOf outputOf relations and maybe resourceInventoriedAs and toResourceInventoriedAs for related resources.
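A minimal sketch of reconstructing the sequence from those relations, assuming each event record carries its resource plus an optional `inputOf` or `outputOf` process id. The `EVENTS` records and the `trace_resource` helper are hypothetical, and the example covers only the simple, loop-free production case.

```python
# Hypothetical observed events; the inputOf / outputOf keys mirror the
# Valueflows relations mentioned above.
EVENTS = [
    {"id": "e1", "action": "produce", "resource": "cotton", "outputOf": "create_cotton"},
    {"id": "e2", "action": "consume", "resource": "cotton", "inputOf": "sew_gown"},
    {"id": "e3", "action": "produce", "resource": "gown", "outputOf": "sew_gown"},
]


def trace_resource(resource, events, seen=None):
    """Walk back: resource <- output event <- process <- input events <- ..."""
    seen = set() if seen is None else seen
    steps = []
    for out in events:
        if out["resource"] != resource or "outputOf" not in out or out["id"] in seen:
            continue
        seen.add(out["id"])
        process = out["outputOf"]
        steps += [out["id"], process]
        for inp in events:
            if inp.get("inputOf") == process and inp["id"] not in seen:
                seen.add(inp["id"])
                steps.append(inp["id"])
                # Continue with whatever produced the consumed resource.
                steps += trace_resource(inp["resource"], events, seen)
    return steps


gown_trace = trace_resource("gown", EVENTS)
```

The result follows the same shape as the trace written out earlier in this thread: gown output, sewing process, cotton input, cotton output, cotton process.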

After reading your pointers, I thought that another possibility was to use the id of the events and processes that in Reflow are time based, and in case of distributed systems using also the algorithm that I saw here:

sending (id of inputOf event on system A)
# event is known
time = time + 1;
# event happens
send(message, time);

receiving (id of outputOf event on system B):

(message, time_stamp) = receive();
time = max(time_stamp, time) + 1;

The id of the outputOf event would be coming from system B, but I realise I do not know how Valueflows works when recording events on distributed systems, so maybe what I say does not make sense.
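The pseudocode above can be written as a small runnable sketch. The `LamportClock` class and the two-system scenario are illustrative assumptions; nothing here reflects how Reflow, Bonfire or Valueflows record events today.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self, time=0):
        self.time = time

    def send(self, message):
        """The event happens locally, then the message carries the new time."""
        self.time = self.time + 1
        return message, self.time

    def receive(self, stamped_message):
        """Jump past both our own time and the sender's time."""
        message, time_stamp = stamped_message
        self.time = max(time_stamp, self.time) + 1
        return message, self.time


# System A has already recorded three events; system B has recorded none.
system_a = LamportClock(time=3)
system_b = LamportClock(time=0)

# A records the inputOf event and tells B about it.
stamped = system_a.send("gown handed over for cleaning")
# B records the outputOf event after receiving A's message.
_, output_time = system_b.receive(stamped)
```

Even though system B started with a lower counter, the time it assigns to the outputOf event is greater than the time of the inputOf event that caused it.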

bhaugen commented 2 years ago

@sbocconi

Regarding your first explanation, I understand what you say at the Recipe and Plan level, but indeed I would not base the ordering of the Observational level (which in my understanding is about instances) on the ordering on the other levels (which in my understanding is about classes). After all plans and recipes go wrong in reality.

Yup! Usually in the first few events...which is why we will eventually need to implement reactive replanning...which is the effects on the plans of the diffs between the real events and the plans.

In my understanding I would base the ordering on the event and process sequence, which again in my understanding can be reconstructed from inputOf outputOf relations and maybe resourceInventoriedAs and toResourceInventoriedAs for related resources.

That was what we have done in previous implementations and wrote up in https://www.valueflo.ws/algorithms/track/#previous-and-next-logic

But I think that does not totally work for this interesting use case.

After reading your pointers, I thought that another possibility was to use the id of the events and processes that in Reflow are time based, and in case of distributed systems using also the algorithm that I saw here:

sending (id of inputOf event on system A)
# event is known
time = time + 1;
# event happens
send(message, time);

receiving (id of outputOf event on system B):

(message, time_stamp) = receive();
time = max(time_stamp, time) + 1;

Is that what you propose to implement in ULIDs with Lamport timestamps?

Or have you already implemented it? If so, I would love to see some examples based on this use case. Might be a better method than the one I have been working through.

The id of the outputOf event would be coming from system B, but I realise I do not know how Valueflows works when recording events on distributed systems, so maybe what I say does not make sense.

If the various processes in that use case are being executed in different nodes in a distributed network, which would happen in a Holochain implementation, and could happen in an ActivityPub implementation where each Agent has their own Pub, then the events involved in each process would be recorded on a different computer.

In Holochain, all of the events and processes would also be published on a Distributed Hash Table that connects all of the Agents in the network, assuming all of that use case is happening in the same economic network. In ActivityPub, the network would be represented by a Valueflows Scope, which we think should have its own Pub, and all of the events that reference the Scope would be forwarded to, and mirrored by, the Scope.

I am not sure how Reflow will organize their network. But does that make sense? And does it affect what you were thinking, or is all ok?

bhaugen commented 2 years ago

P.S. @sbocconi

the Observational level (which in my understanding is about instances) on the ordering on the other levels (which in my understanding is about classes).

Close, but maybe not quite, depending how you are thinking about it?

The Knowledge level is populated by Type Objects which are user-defined data that behaves like classes.

The Plan and Observation levels are populated by instances of those Type Objects: in the Plan, by schedules for events that have not happened yet, and in the Observations, by records of what did happen.

bhaugen commented 2 years ago

@sbocconi would you have time to give me some feedback on my proposed solution to the bug I linked to upthread (the medical gowns use case with the repetitive process cycles)? And think about whether ULIDs with Lamport Timestamps would be better? And how you could or already did implement ULIDs with Lamport Timestamps?

Here's proposed solution: https://cryptpad.fr/pad/#/2/pad/view/EpvHaJRYYnpNFPX+We9EAmb52aHtZpfSfAQixMc88ZI/

I'm asking a few people for feedback before I publish it, and also doing some desk testing using some data generated nicely by @srfsh.

bhaugen commented 2 years ago

@sbocconi I'm returning to this algorithm because I'm engaged in several other conversations about these topics and the ULID with a Lamport timestamp came up again.

Did you try this algorithm? If so, how did it work for this use case?

After reading your pointers, I thought that another possibility was to use the id of the events and processes that in Reflow are time based, and in case of distributed systems using also the algorithm that I saw here:

sending (id of inputOf event on system A)
# event is known
time = time + 1;
# event happens
send(message, time);

receiving (id of outputOf event on system B):

(message, time_stamp) = receive();
time = max(time_stamp, time) + 1;

The id of the outputOf event would be coming from system B, but I realise I do not know how Valueflows works when recording events on distributed systems, so maybe what I say does not make sense.

It does make sense, I think. How Valueflows works in distributed systems will depend a lot on the distributed system, the nodes involved, and the messages, etc. Seems like this idea could be useful.

fosterlynn commented 2 years ago

@sbocconi @srfsh re. pack/unpack:

I truly apologize for this change at this stage of your deliverables, but we felt it was needed, based on working through the pack-unpack sequence for the hospital gowns, and then looking into other use cases that are similar. Instead of following the container itself, we are following a package that includes the container. At the same time, we changed the naming to make it more generalized for other use cases that need that pattern. Here's a picture, which I think is the easiest way to understand it: https://www.valueflo.ws/examples/ex-production/#pack-unpack.

We have been very conscious of avoiding breaking changes during this multi-year stage of several projects implementing VF. You are the only one implementing pack/unpack so far of that group of projects, and since trace isn't coded yet, it is fairly contained at the moment. So it seemed like we should do the improvement now. And I think it will make the trace easier for you. But, it is a breaking change.

As always, we're happy to work with you all as closely as you like to get this change plus the trace done in your timeframe. And thanks for the ongoing collaboration!

sbocconi commented 2 years ago

@sbocconi I'm returning to this algorithm because I'm engaged in several other conversations about these topics and the ULID with a Lamport timestamp came up again.

Did you try this algorithm? If so, how did it work for this use case?

Hello @bhaugen, I have not tried this algorithm. I thought about using it when you started talking about the problem of time in distributed systems, and considering Bonfire uses ULID it seems that there could be something to integrate given that otherwise the ordering of the ULID might not make sense across systems.

This is something for the developers of Bonfire to consider, I am a user of that system in Reflow, but I do not contribute to its development.

bhaugen commented 2 years ago

@sbocconi thanks for your comment. Re:

considering Bonfire uses ULID it seems that there could be something to integrate given that otherwise the ordering of the ULID might not make sense across systems.

Hmm, I did not know that Bonfire was using ULID. Will have to ask them how they did it. Do you know? Was it like the code you posted upthread?

I am looking at three different methods of causally ordering resource flows:

I want to corral all the VF dev cats into a discussion to agree on what to use, if possible. Looks like ULID might be better than my proposal and maybe the easiest to implement?

What do you think? I am not married to my proposal, it was a bet with @fosterlynn that I could figure out a way to causally order events without using timestamps.

bhaugen commented 2 years ago

I talked with the Bonfires. Their ULID implementation uses timestamps based on the server clock, which means, in a distributed system, you can't trust the ULIDs for causal ordering.
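To see why, here is a minimal sketch of the ULID layout (a 48-bit millisecond timestamp followed by 80 random bits, encoded in Crockford base32). The encoder below is a simplified illustration written for this thread, not Bonfire's implementation, and the clock values are made up.

```python
import os

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(timestamp_ms, randomness=None):
    """Encode a 48-bit ms timestamp plus 80 random bits as 26 characters."""
    if randomness is None:
        randomness = int.from_bytes(os.urandom(10), "big")
    value = (timestamp_ms << 80) | randomness
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])  # lowest 5 bits first
        value >>= 5
    return "".join(reversed(chars))


def ulid_timestamp_ms(ulid):
    """Decode the first 10 characters back into the ms timestamp."""
    timestamp = 0
    for ch in ulid[:10]:
        timestamp = timestamp * 32 + CROCKFORD.index(ch)
    return timestamp


# The cause is recorded on a server with an accurate clock.
cause_id = encode_ulid(1_643_643_961_000, randomness=0)
# The effect happens later, but on a server whose clock is 1 s behind.
effect_id = encode_ulid(1_643_643_960_000, randomness=0)

sorted_ids = sorted([cause_id, effect_id])
```

Because the id sorts by whatever the local server clock said, the effect ends up ordered before its cause as soon as the two servers disagree about the time.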

So I think either the breadcrumbs or versioning the resources on every change will be necessary. The breadcrumbs might be better for interoperability across different stacks.

sbocconi commented 2 years ago

Yes I agree, you cannot rely on the current implementation of ULIDs in Bonfire, that is why I thought that Lamport timestamps would be a nice addition to Bonfire, but that was a suggestion to the Bonfire developers.