matryer / moq

Interface mocking tool for go generate
http://bit.ly/meetmoq
MIT License
2k stars 127 forks source link

Is there a way to speed up the moq generation time? #178

Closed tomicooler closed 1 year ago

tomicooler commented 2 years ago

Hi!

Just wondering is there a way to speed up the moq generation time? Something like gomockhandler for mockgen, or committing the generated files and ensure that they kept up-to-date.

Currently in our project, running go generate ./... takes around 3-4 minutes, we have 68 instances of //go:generate moq in our codebase.

Thanks.

djui commented 2 years ago

We have 46 instances and have 69s. It's of course not ran often, but would still be nice to do it faster.

Mostly each mock generation should be able to be run in parallel. At least the option should be given, so that one can decide that for oneself.

breml commented 2 years ago

I am not sure, if this is really an issue with moq it self. If one executes go generate ./... it will traverse through all the packages in sequence and execute all the commands, that are listed as //go:generate directives (again in sequence). This is just how go generate works.

There has been a request to make it parallel, but this one has been declined: https://github.com/golang/go/issues/20520

See also https://github.com/golang/go/issues/20520#issuecomment-308545068 for a quote from go generate -help about the order of execution.

djui commented 2 years ago

Yes, saw the golang issue, but that doesn't seem to go anyway due to philosophy or constraint guarantees.

The idea here is to just not use moq with go generate which wouldn't allow parallelism, but rather an execution way to let moq itself traverse and generate.

I get that there are questions around how to declare filenames and interface names if they are different for each file, but maybe we can come up with a solution there.

djui commented 2 years ago

Also: I'm quite sure go generate itself is fast, so moq must take 69s/46 ~ 1.5s per mock generation which, if assumption is right, can't see as optimal.

djui commented 2 years ago

Some numbers from a 2.3GHz i9 8-Core macbook:

$ find . -name '*_mock.go' -delete ; time go generate ./...
        1.34 real         5.44 user         1.06 sys
        0.65 real         2.54 user         0.61 sys
        1.04 real         5.02 user         0.90 sys
        0.65 real         2.60 user         0.60 sys
        0.73 real         2.84 user         0.70 sys
        0.67 real         2.70 user         0.59 sys
        0.76 real         3.10 user         0.62 sys
        1.04 real         5.30 user         1.04 sys
        1.05 real         5.81 user         1.01 sys
        0.36 real         1.11 user         0.30 sys
        1.08 real         5.84 user         1.05 sys
        1.09 real         6.02 user         1.07 sys
        1.04 real         5.45 user         0.99 sys
        0.65 real         2.63 user         0.62 sys
        0.37 real         1.21 user         0.34 sys
        1.14 real         6.00 user         1.06 sys
        0.41 real         1.30 user         0.35 sys
        1.29 real         6.61 user         1.14 sys
        1.28 real         6.69 user         1.14 sys
        0.39 real         1.31 user         0.34 sys
        0.72 real         2.84 user         0.65 sys
        0.38 real         1.28 user         0.36 sys
        0.67 real         2.83 user         0.61 sys
        0.40 real         1.26 user         0.40 sys
        1.11 real         6.00 user         1.12 sys
        0.38 real         1.18 user         0.38 sys
        1.32 real         6.66 user         1.16 sys
        1.27 real         6.62 user         1.17 sys
        1.32 real         6.73 user         1.21 sys
        1.49 real         7.04 user         1.17 sys
        1.29 real         6.82 user         1.10 sys
        1.28 real         6.61 user         1.15 sys
        1.28 real         6.58 user         1.14 sys
        1.27 real         6.71 user         1.10 sys
        1.27 real         6.70 user         1.15 sys
        1.27 real         6.65 user         1.13 sys
        1.28 real         6.61 user         1.12 sys
        0.68 real         2.73 user         0.66 sys
        0.38 real         1.24 user         0.38 sys
        1.08 real         5.60 user         1.01 sys
        0.85 real         2.80 user         0.67 sys
        0.67 real         2.90 user         0.63 sys
        0.68 real         2.95 user         0.65 sys
        0.68 real         2.79 user         0.63 sys
        0.68 real         2.91 user         0.63 sys
        0.38 real         1.20 user         0.40 sys
go generate ./...  194.22s user 37.85s system 557% cpu 41.662 total

So that's 0.89s per mock.

djui commented 2 years ago

Also no significant changes with -skip-ensure or -fmt noop:

go generate ./...  208.86s user 40.56s system 523% cpu 47.619 total
go generate ./...  193.61s user 38.79s system 530% cpu 43.804 total
tigerquoll commented 2 years ago

@djui , Try using a conditional execution of moq based on comparative age of the mock source and the test file eg. //go:generate sh -c "test httpClient_generated_test.go -nt $GOFILE && exit 0; moq -out httpClient_generated_test.go . httpClient"

matryer commented 2 years ago

This is great. We'd have to dig into it to see where the time is being spent.

Worst case, you could have a script that you call that runs all the moq commands concurrently? (I've done this in the past, and just execute that script in the go:generate line.) Not the most platform agnostic solution, but it worked.

djui commented 2 years ago

I am doing the parallel part currently with a script, but yes, it has a few tradeoffs and less elegance.

As an exercise for the reader I will try to just instrument where time in the script is spent.

Maybe there are low hanging optimization fruits. But overall, it’s not the end of the world.

breml commented 2 years ago

My assumption is, that loading the package information (https://github.com/matryer/moq/blob/13aa0482dc83d813df2ae11a107bf1f697298d09/internal/registry/registry.go#L130) is the most expensive part, especially if moq is used in a rather large module / repository or if there are lots of dependencies. Normally during a go generate ./..., moq is executed multiple times (once for each interface that needs to be mocked). For every execution, moq needs to load the package information again from scratch, which is rather expensive operation, especially with all the information that is needed (packages.NeedName|packages.NeedSyntax|packages.NeedTypes|packages.NeedTypesInfo|packages.NeedDeps).

So a low hanging fruit might be, if currently there is more information loaded than necessary. If this is not the case, one optimization that I can think of is, that moq learns to generate multiple mocks in one run, such that the package information can be loaded just once and then be cached / reused. But I guess, this would be a major refactoring.

haunt98 commented 2 years ago

Can we use simple approach, as suggest in the article below. That moq only run if mock file is older than source file?

https://jonwillia.ms/2019/12/22/conditional-gomock-mockgen

djui commented 2 years ago

Approach 1: Performance

My assumption is, that loading the package information (

https://github.com/matryer/moq/blob/13aa0482dc83d813df2ae11a107bf1f697298d09/internal/registry/registry.go#L130

) is the most expensive part, especially if moq is used in a rather large module / repository or if there are lots of dependencies. Normally during a go generate ./..., moq is executed multiple times (once for each interface that needs to be mocked). For every execution, moq needs to load the package information again from scratch, which is rather expensive operation, especially with all the information that is needed (packages.NeedName|packages.NeedSyntax|packages.NeedTypes|packages.NeedTypesInfo|packages.NeedDeps). So a low hanging fruit might be, if currently there is more information loaded than necessary. If this is not the case, one optimization that I can think of is, that moq learns to generate multiple mocks in one run, such that the package information can be loaded just once and then be cached / reused. But I guess, this would be a major refactoring.

The assumption is correct. Most (almost all) time is spent in packages.Load. I can't see that moq would need less information in that step.

Approach 2: Batch

For moq to generate all mocks at once, I try to see how the command line would look like? e.g. how would moq know which interfaces to mock and into which output files to write? We currently use e.g. //go:generate moq -out repository_mock_test.go . Repository so either the command line would have to get all output file names, or generate them via pattern; same with package names and interface names.

Or moq uses a search function e.g. looking for //go:generate itself but there I can already see a lot of incompatibility bugs with the actual go generate behaviour.

Or moq uses its own annotation, e.g. //moq:generate then it would find (1) source file, (2) interface ("line after") and could generate (3) output file with a pattern (outsuffix=_mock_test[.go]).

(3) Could be a nice addition in general, to convert:

//go:generate moq -out repository_mock_test.go . Repository

into:

//go:generate moq -outsuffix _mock_test.go . Repository

Approach 3: MTime

Can we use simple approach, as suggest in the article below. That moq only run if mock file is older than source file?

https://jonwillia.ms/2019/12/22/conditional-gomock-mockgen

The problem with checking the modification time of the file containing the interface and the mock output file is that moq is not aware of the interface's filename (at least not until the syntax tree is parse which takes most of the run-time), if the approach is to modify moq to do that time check and not an external script.

The external script (or shell if) seems to work for me:

-//go:generate moq -rm -out client_mock.go . Client
+//go:generate sh -c "test $GOFILE -nt repository_mock.go && moq -rm -out client_mock.go . Client"

We could add a flag -in|-mtime to either explicitly state the source file or use the $GOFILE envvar while present during go generate runs, then moq could do the check. See https://github.com/matryer/moq/pull/179/files

breml commented 2 years ago

For Approach 2 I think we would need to rethink the way moq is used. In that scenario I would expect to only have a small number of go:generate lines throughout a large project, in most cases even only one call (to get the most out of the "caching" of the loaded package information). For the batch mode, I would expect some sort of configuration file for moq, which contains a list of all the mocks that are needed to be generated. This list would contain the necessary information like name of the interface, package/folder where the interface is located, destination file.

haunt98 commented 1 year ago

Hi, I just read a blog post from mockery maintainer to discuss about generating package mock not single file mock as here https://landontclipp.github.io/blog/2023/06/21/introducing-mockerys-new-packages-feature/.

Hope it helps