Closed xuannv112 closed 1 year ago
what go/garble version could we use to avoid this issue?
> $ garble run *.go [±master ●]
# command-line-arguments
/home/m/.cache/garble/tool/link: running gcc failed: exit status 1
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.xdOu7h.swNWWA':
go.go:(.text+0x125347): undefined reference to `oUEfqo.iNRMZd'
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.iU5VGNz74.aQw4UsJmR7':
go.go:(.text+0x1256eb): undefined reference to `oUEfqo.iNRMZd'
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.xGBWdJXQaj6v':
go.go:(.text+0x1257f2): undefined reference to `oUEfqo.iNRMZd'
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.(*pPSrIql).nOIY6ItdS':
go.go:(.text+0x12f9c1): undefined reference to `oUEfqo.vXtJ6QlKb'
collect2: error: ld returned 1 exit status
exit status 2
exit status 1
There are some temporary solutions 1/ Use GOGARBLE to specify which package you want to obfuscate -> I prefer this way 2/ Downgrade to go1.20.x (x<5), garblev0.9.3 (actually you only need to use the master branch but before this commit) 3/ For go-spew package, you can build with the safe tag -tags safe
@xuannv112 sounds like you have identified an issue and have a rough idea how to fix it, happy to review your PR. See CONTRIBUTING.md to get started, if you get stuck somewhere feel free to ask for help in our slack channel.
Note that we can't simply remove the check at https://github.com/burrowers/garble/blob/v0.10.0/reflect.go#L432, because the following line records the type as pkg.Path() + "." + obj.Name()
, which assumes top-level declarations.
My change in https://github.com/burrowers/garble/commit/0f2b59d79404f5be6725c24170fa2fad94e2b7fb was slightly flawed, as can be seen by this regression. When obfuscating a package, we still need to temporarily record what local types aren't obfuscated, even if those don't matter when obfuscating importers.
We could consider reverting, but the old code was also flawed and it would lead to bad obfuscation depending on the state of the cache.
I think it's probably time to rewrite recordedObjectString
; it has a couple of hacks and TODOs, so we already knew that it wasn't great :) I'll take a crack at it tonight, since I think that's probably important enough for v0.10.1.
Agree that we should check if the local types should be obfuscated or not. (using reflection and being used in an embedded struct). For a simpler solution, I think we can return this (use file-level instead of package level (if remove))
if pkg.Scope() != obj.Parent() { pos := fset.Position(obj.Pos()) return fmt.Sprintf("%s.%s - %s", pkg.Path(), obj.Name(), filepath.Base(pos.Filename)) }
It means in the same file, all local types with the same obj name will not be obfuscated. I think it is acceptable.
Or we can check in the object's scope if there are any structs that used the object as an embedded struct.
getObjId := func(pkg *types.Package, obj types.Object) string { pos := fset.Position(obj.Pos()) return fmt.Sprintf("%s.%s - %s:%d", pkg.Path(), obj.Name(), filepath.Base(pos.Filename), pos.Line) } if pkg.Scope() != obj.Parent() { // check if obj is a field of an embedded struct by lookup it's scope if _, ok := obj.(*types.TypeName); ok { scope := obj.Parent() for _, objName := range scope.Names() { otherObj := scope.Lookup(objName) if structType, ok := otherObj.Type().(*types.Struct); ok { for i := 0; i < structType.NumFields(); i++ { field := structType.Field(i) if field.Name() == obj.Name() && field.Embedded() { return getObjId(pkg, obj) } } } } } return "" }
So we will ignore only local types used in embedded field (without impact to other obj)
Definitely don't do that - that sort of double loop linear search will be very expensive in this part of the codebase.
I think it's okay, for now, to change recordedObjectString to do the same for local types that it already does for struct fields. It's a hack, but it should work.
I have moved the embedded field check logic to only ri.recordUsedForReflect
function to save cost.
https://github.com/burrowers/garble/pull/767
Here is the benchmark result:
go test -run=- -bench=. -count=6 -benchtime=1x goos: darwin goarch: amd64 pkg: mvdan.cc/garble cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ Build-12 24.84 ± 9% 25.20 ± 6% ~ (p=1.000 n=6) │ old.txt │ new.txt │ │ bin-B │ bin-B vs base │ Build-12 5.642Mi ± 0% 5.643Mi ± 0% +0.02% (p=0.002 n=6) │ old.txt │ new.txt │ │ cached-sec/op │ cached-sec/op vs base │ Build-12 611.3m ± 36% 723.0m ± 41% +18.28% (p=0.041 n=6) │ old.txt │ new.txt │ │ mallocs/op │ mallocs/op vs base │ Build-12 34.54M ± 0% 36.59M ± 0% +5.93% (p=0.002 n=6) │ old.txt │ new.txt │ │ sys-sec/op │ sys-sec/op vs base │ Build-12 17.25 ± 10% 17.13 ± 5% ~ (p=0.818 n=6)
I think i can optimize a litte bit by scanning only when there are 2 conditions: reflection + embedded field. It is better when trigger by only 1 condition: reflection
I have updated the logic: checking the embedded fields. No need for linear search at all. Here is the benchmark result
goos: darwin goarch: amd64 pkg: mvdan.cc/garble cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ Build-12 24.84 ± 9% 25.26 ± 8% ~ (p=0.937 n=6) │ old.txt │ new.txt │ │ bin-B │ bin-B vs base │ Build-12 5.642Mi ± 0% 5.648Mi ± 0% +0.11% (p=0.002 n=6) │ old.txt │ new.txt │ │ cached-sec/op │ cached-sec/op vs base │ Build-12 611.3m ± 36% 592.4m ± 16% ~ (p=0.937 n=6) │ old.txt │ new.txt │ │ mallocs/op │ mallocs/op vs base │ Build-12 34.54M ± 0% 36.60M ± 0% +5.94% (p=0.002 n=6) │ old.txt │ new.txt │ │ sys-sec/op │ sys-sec/op vs base │ Build-12 17.25 ± 10% 17.52 ± 8% ~ (p=0.485 n=6)
Hang on, but why do we need the extra logic? As far as I know, the %s.%s - %s:%d
hack will work just fine for local types just like it already does for fields. We don't need the extra code, I'm fairly sure.
For this use case
type Tqw int var T11 struct { A Tqw y Tqw a Tqw } _ = reflect.ValueOf(T11).FieldByName("A")
Tqw
is still obfuscated as the old logic.
1/ My PR will not obfuscate Tqw
only when Tqw
is used as an embedded field
2/ If there is no extra logic and simply return %s.%s - %s:%d
, for normal type (not used as an embedded field), it will not be obfuscated as the old logic.
It is a minor case, but I want to keep the old logic unchanged if the cost is acceptable.
What version of Garble and Go are you using?
What did you do?
Here is the sample code for obfuscation
What did you expect to see?
MyInt type shouldn't be obfuscated.
What did you see instead?
Notes:
1/ It is the same issue when using garble with go spew package go-spew
We obfuscate t0 field so vt0 returns 0 and panic later in the init function.
panic("reflect.Value read-only flag has changed semantics")
2/ Solution I think when using reflection, we can record non top-level objects as well. So is it ok to remove this logic? No need to compare pkg.Scope() vs obj.Parent() Check top level objects
The tradeoff here: There is no obfuscation even if there is no embedded struct when using reflection