Closed majk-p closed 1 month ago
I'm thinking to do both:
I agree with Jakub.
Sounds good, I can raise a PR to address this
Working on my initial problem I've noticed an additional issue around pathFormat
implicit val pathFormat: JsonFormat[os.Path] =
BasicJsonProtocol.projectFormat[os.Path, HashFileInfo](
p => {
if (os.isFile(p)) FileInfo.hash(p.toIO)
else
// If the path is a directory, we get the hashes of all files
// then hash the concatenation of the hash's bytes.
FileInfo.hash(
p.toIO,
Hash(
os.walk(p)
.map(_.toIO)
.map(Hash(_))
.foldLeft(Array.emptyByteArray)(_ ++ _)
)
)
},
hash => os.Path(hash.file)
)
It assumes all paths will exist, which is not always the case. In one of my sub-modules I noticed the resource_managed
was not present, which silently causes the cache miss and code regeneration. Since we're about to exclude hashes of the outputs (both sources and resources) the issue should disappear for now, but this may require some more thought in future.
The paths are checked for existence first:
val inputFiles =
(inputDirs ++ generatedFiles)
.filter(_.exists())
.toList
but only for the inputs 😅
TL;DR
Can we please change:
to something like:
Or exclude
codegenArgs.output
from cache altogether?Problem
I'm working on an sbt plugin that uses smithy4s, but also generates some of it's own managed sources. Smithy4s produces files to smithy4sOutputDir which by default is defined as
smithy4sOutputDir := (config / sourceManaged).value / "scala"
. This value is then later used by Smithy4sCodegenPlugin as a part ofcodegenArgs
used to produce files. The entireCodegenArgs
object is used as a cache key and thus theoutput
counts in. That wouldn't be a problem because the path would just remain cached as aString
, but smithy4s also overrides the json format for os.Path in a way that calculates the hash of the entire directory content. It isn't a problem as long as smithy4s is the only source generator. When another plugin adds managed sources the hashes change and we stop hitting cache. In some cases this might lead to repeated very long compilation times.Perhaps it's worth reconsidering if smithy4s codegen cache should rely on the contents of
source_managed
andresource_managed
at all, since it seems easy to invalidate the cache.It's also not obvious to the end user why the cache got invalidated. In my debugging case I would be very happy if the plugin told me why it decided to rerun codegen even though smithy sources remained unchanged.
Solution
I think adding another segment in the path should solve this and save fellow sbt plugin authors some headache. This way other plugins can either use the root
sourceManaged / scala
directory or anything different thansourceManaged / scala / smithy4s
.The other option is to modify
JsonConverters.codegenArgsIso
in a way that it doesn't serializeoutput
.