Closed ericzundel closed 9 years ago
I added an explicit exception in this case
FAILURE: Jar IvyArtifact(path='/Users/square/.ivy2/pants/joda-time/joda-time/jars/joda-time-2.8.jar', classifier=None) in 3rdparty:joda-time.joda-time not resolved to the ivy symlink map in conf default.
and debugging to print out the name of the the ivy report:
DEBUG] Parsing ivy report /Users/square/.ivy2/pants/internal-da71ce64e94a8ba548d1ba3b68c0103b301e0206-default.xml
DEBUG] Parsing ivy report /Users/square/.ivy2/pants/internal-da71ce64e94a8ba548d1ba3b68c0103b301e0206-sources.xml
DEBUG] Parsing ivy report /Users/square/.ivy2/pants/internal-da71ce64e94a8ba548d1ba3b68c0103b301e0206-javadoc.xml
This report has joda-time at 2.8:
<caller organisation="internal" name="da71ce64e94a8ba548d1ba3b68c0103b301e0206" conf="default" rev="2.8" rev-constraint-default="2.8" rev-constraint-dynamic="2.8" callerrev="latest.integration"/>
<artifacts>
<artifact name="joda-time" type="jar" ext="jar" status="no" details="" size="621926" time="0" location="/Users/square/.ivy2/pants/joda-time/joda-time/jars/joda-time-2.8.jar">
<origin-location is-local="false" location="https://nexus.corp.squareup.com/content/groups/public/joda-time/joda-time/2.8/joda-time-2.8.jar"/>
</artifact>
</artifacts>
But our 3rdparty/BUILD file has the right version:
jar_library(name='joda-time.joda-time',
jars = [
sjar(org='joda-time', name='joda-time', rev='2.8.1',)
],
)
Check the date on this report:
$ ls -l /Users/square/.ivy2/pants/internal-da71ce64e94a8ba548d1ba3b68c0103b301e0206-default.xml
-rw-r--r-- 1 square staff 219161 Jun 23 15:43 /Users/square/.ivy2/pants/internal-da71ce64e94a8ba548d1ba3b68c0103b301e0206-default.xml
compared to the date on when the change was made to the source of the dependency version (we generate our BUILD files from pom.xml files):
3b3b002ee9 parents/base/pom.xml (xxxx xxxxx 2013-04-22 16:10:49 -0700 457) <dependency>
bc0fff6715 base-pom.xml (xxxxx xxxxxxx 2012-07-11 22:19:35 -0700 458) <groupId>joda-time</groupId>
bc0fff6715 base-pom.xml (xxxxx xxxxxxx 2012-07-11 22:19:35 -0700 459) <artifactId>joda-time</artifactId>
c582b660f8 parents/base/pom.xml (xxxxx xxxxxxxxxx 2015-06-24 14:33:43 -0700 460) <version>2.8.1</version>
bc0fff6715 base-pom.xml (xxxxxx xxxxxxx 2012-07-11 22:19:35 -0700 461) </dependency>
We wrap JarDependency with our own plugin 'sjar()' that provides global excludes.
It is a subclass of JarDependency, so I thought it would be OK, but I wonder if that is messing up the ivy caching?
I changed the sjar() to just jar() and it made no difference - pants is still trying to use the old ivy report.
I don't think your plugin is the problem. I have seen this error many times when moving android off codegen and then adding JarDependency to the new unpack_libraries code and have even done some (mostly fruitless) digging myself.
I think I know what the problem is based on the evidence above and reading the invalidation code in IvyTaskMixin. These ivy files are named with the sha of all the target names. It is depending on the invalidation logic to determine when to re-resolve and overwrite the file with updates. But invoking another target could cause ivy to fetch a new version of a jar_library. If this is done after a clean-all, the old symlink will be gone . Then going back to the original target there is an old report file that matches the name and all of the targets are valid, but the report was generated before one of the revs was modified on one of the jars.
That is my theory, but I haven't been able to reproduce it yet. I think the worst part of this bug is that it may NOT fail, but leave some targets silently linking against the wrong version of a library.
On Sat, Jun 27, 2015, 5:18 PM Mateo Rodriguez notifications@github.com wrote:
I don't think your plugin is the problem. I have seen this error many times when moving android off codegen and then adding JarDependency to the new unpack_libraries code and have even done some (mostly fruitless) digging myself.
— Reply to this email directly or view it on GitHub https://github.com/pantsbuild/pants/issues/1747#issuecomment-116144315.
Note that I've visited a similar problem before, see https://rbcommons.com/s/twitter/r/2015/, github issue #1350
I'm still working to reproduce this issue outside of our environment. But based on what I'm seeing, I think the solution to it is to change the way the sha is computed for the name of the internal-<sha>-<conf>.xml
files which would cause us to re-run ivy because resolved-internal-<sha>-latest-integration.xml
file would be missing in this case:
In our tree we have 3 targets that are closely related that exhibit the problem. I have a script that will reproduce the problem in our repo.
The logic of the script is essentially:
rm all internal files under ~/.ivy2/pants
./pants clean-all
# edit 3rdparty/BUILD and set joda-time version to 2.8
./pants resolve.ivy target-a::
./pants clean-all
# edit 3rdparty/BUILD and set joda-time version to 2.8.1
./pants resolve.ivy target-b::
./pants resolve.ivy target-c::
./pants resolve.ivy target-a:: # Reproduces the problem
I have tried creating similar logic using the pants repo and thus far not succeeded in reproducing it there.
Here is a patch that will change the way the ivy sha is computed:
diff --git a/src/python/pants/backend/jvm/ivy_utils.py b/src/python/pants/backend/jvm/ivy_utils.py
index c6e19f5..a958e53 100644
--- a/src/python/pants/backend/jvm/ivy_utils.py
+++ b/src/python/pants/backend/jvm/ivy_utils.py
@@ -247,11 +247,7 @@ class IvyUtils(object):
@staticmethod
def identify(targets):
- targets = list(targets)
- if len(targets) == 1 and targets[0].is_jvm and getattr(targets[0], 'provides', None):
- return targets[0].provides.org, targets[0].provides.name
- else:
- return 'internal', Target.maybe_readable_identify(targets)
+ return 'internal', Target.invalidation_hashes(list(targets))
@classmethod
def xml_report_path(cls, targets, conf):
diff --git a/src/python/pants/base/target.py b/src/python/pants/base/target.py
index 297cedc..d66c726 100644
--- a/src/python/pants/base/target.py
+++ b/src/python/pants/base/target.py
@@ -166,6 +166,11 @@ class Target(AbstractTarget):
return cls.maybe_readable_combine_ids([target.id for target in targets])
@staticmethod
+ def invalidation_hashes(targets):
+ """Generates an invalidation hash for a collection of targets."""
+ return hash_all([target.invalidation_hash() for target in sorted(targets)])
+
+ @staticmethod
def combine_ids(ids):
"""Generates a combined id for a set of ids."""
return hash_all(sorted(ids)) # We sort so that the id isn't sensitive to order.
The idea behind the cache is that this will incorporate the version number requested by the JarDependency since the names are computed from the invalidation hash, not just the target names.
This fixes the problem in my repro case, but its going to lead to a proliferation of internal*.xml
and resolved-internal-*(properties|xml)
files in the ivy cache dir. To solve that we might move these files out of the ivy cache and into .pants.d after running ivy resolve. This adds complexity and would lead to more work during a clean-all, so I'm not sure how good of an idea that is.
I'm still looking for a repro case outside of our repo. The 'target-a' I used is very complex:
./pants depmap samsa/bridge | wc -l
1564169
I committed the debugging in issue #1748 (commit aedb181).
So, in order to trigger this bug, we have to have 3 things 1:
if report_missing or invalidation_check.invalid_vts or not os.path.exists(raw_target_classpath_file):
args = ['-cachepath', raw_target_classpath_file_tmp] + (custom_args if custom_args else [])
self.exec_ivy(
...)
1) There must be a resolved-internal*.xml
report in ~/.ivy2/pants/ for the same set of target ids (run before a change of jar revision and before ./pants clean-all
)
2) The same jars must have already been resolved in ivy resolve so that global_vts.invalid_vts
is an empty list
3) There must be an existing raw_target_classpath
file in the .pants.d directory. The sha for this is derived from the cache_key created from the cache_keys for all the VersionTargetSets in global_vts.
I can understand how we might trick 1) by resolving the jars before the version change. I understand how we can trick 2) by re-resolving a superset of the targets for target-a in the example above.
I'm working on how to trick 3)
See https://rbcommons.com/s/twitter/r/2435/
Fixed in commit Commits dd78698ca and 7b07716
Periodically we see a KeyError raised from pants (although the exception type isn't printed). We see this in our build of the world script that crawls the repo and runs pants commands on every directory.
We've see this happen with different artifacts. The solution is to remove the directory from under ~/.ivy2/pants/
The first problem is that there is very little information reported.
The exception stacktrace is:
Here is the code from that rev:
In this case, the version of joda-time was updated from 2.8 to 2.8.1 the day before. Our repo only references revision 2.8.1
The ivy cache contains:
The symlink cache contains:
I turned on -ldebug and didn't see much (I've saved a copy in ~/pants/idea-ivy-resolve-key-error.log on my laptop)
Of note, the cache here has a lot of old internal-.xml files and a lot of 'resolved-.xml' files.
The reports are showing version 2.8.1 properly resolved. That would explain why it isn't in the symlink map. I'll follow up after modifying pants to get more debugging information.