Closed ThierryGoubier closed 11 years ago
this looks very nice, I just had a look at the repo itself and it makes working with filetree and normal git tools much more fluent. BTW, this was the reason I never used filetree for the pharo-core export since there was simply too much noise to recognize any crucial information.
nice work
I'm now thinking that, with this code, I could remove the methodProperties.json file as well, since the Author and timestamp are recorded by the git commit.
Thierry
I haven't had a chance to review your code yet, but the methodProp erties.json file is only there to preserve Monticello meta data ...
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Sent: Monday, July 22, 2013 8:48:16 AM | Subject: Re: [filetree] Metadata less (#93)
| I'm now thinking that, with this code, I could remove the | methodProperties.json file as well, since the Author and timestamp | are recorded by the git commit. | Thierry | — | Reply to this email directly or view it on GitHub .
Oh. I believed it was used to record the timestamp and creator as recorded in the changes file ? Because there is also the properties.json file as well... But I don't think I should try to get rid of this one.
The methodProperties.json file records meta data that is part of the Monticello definitions ... Git records the commiter and commit data for each method (file) change, so it should be possible to reconstruct the method meta data from Git and elminate the methodProperties.json file.
The properties.json files manage FileTree information and should not/cannot be eliminated...
Dale ----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Monday, July 22, 2013 10:08:01 AM | Subject: Re: [filetree] Metadata less (#93)
| Oh. I believed it was used to record the timestamp and creator as | recorded in the changes file ? Because there is also the | properties.json file as well... But I don't think I should try to | get rid of this one. | — | Reply to this email directly or view it on GitHub .
Ok, the next two commits are without the method properties metadata. Now, the timestamp is exclusively the data stored by git upon the commit (i.e. not the real time the method is written, but when it's committed to the git repository).
The code is a bit rough, in that it still reads the methodProperties metadata if it is present, but it doesn't do anything with it. If you think it is Ok, I'll clean the code.
Very good! The committer and commit timestamp are perfectly adequate replacements...
I hope to be able to find time to look at what you've don in finer detail today (I worked from home last week, so today is my first day back in the office for a week and there are meetings and other things going on that are getting in the way:)
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Monday, July 22, 2013 12:18:34 PM | Subject: Re: [filetree] Metadata less (#93)
| Ok, the next two commits are without the method properties metadata. | Now, the timestamp is exclusively the data stored by git upon the | commit (i.e. not the real time the method is written, but when it's | committed to the git repository). | The code is a bit rough, in that it still reads the methodProperties | metadata if it is present, but it doesn't do anything with it. If | you think it is Ok, I'll clean the code. | — | Reply to this email directly or view it on GitHub .
Here it is.
Will be a lot more readable that way...
But it makes the changes look strange... Everything is marked green, I guess somewhere the timestamps are seen as different.
Oh, it required a reload because the timestamps in the image had to change.
To summarize:
Works for the working copy changes; now what about history change browsing?
Thierry,
For what its worth, this is the compromise that was used for the earlier implementations of metadataless filetree:) (and yes there were multiple implementations) ...
The main drawback to the metadataless approach is that you are abandoning the Monticello history (actual history and algorithm) so once you toss the Monticello history, it is difficult to reintegrate the package back into the mcz universe ... possible but more difficult ... and that is the reason that I settled on managing meta data: that way folks could try FileTree on for size without abandoning their Monticello history) ... OTOH, if you are using FileTree exclusively then it makes total sense to get rid of all the Monticello meta data in favor of the git meta data...
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Monday, July 22, 2013 1:47:56 PM | Subject: Re: [filetree] Metadata less (#93)
| Oh, it required a reload because the timestamps in the image had to | change. | To summarize:
| * when loading a package, all methods timestamps are set to the | package version timestamp, and not the timestamp of the method last | change committed to git. * however, it is still possible to query | git on a method by method basis for that "committed" timestamp, see | getVersionsForDefinition:. * But this is too slow to be done on all | methods of a package when loading it.
| Works for the working copy changes; now what about history change | browsing? | — | Reply to this email directly or view it on GitHub .
Yes the tools recognize "timestamp only differences" ... it's a tools thing and when I used the metadataless forms, I hacked the tools to not treat timestamps as differences...
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Monday, July 22, 2013 1:21:23 PM | Subject: Re: [filetree] Metadata less (#93)
| But it makes the changes look strange... Everything is marked green, | I guess somewhere the timestamps are seen as different. | — | Reply to this email directly or view it on GitHub .
@ThierryGoubier I'm going to hold off merging this pull request until I get pharo3.0 working by itself ... I know this doesn't have anything to do with Pharo3.0, but I'm probably going to have push a new version (or two) to ss3 before all's said and done, so I want to minimize churn ...
In my opinion:
changing a bit the Monticello tools to disregard timestamp only differences is correcting a bug in my opinion (now I understand why I see unnecessary changes at times).
And what I have done so far is good at recreating a Mcz-compatible history from the git repository, so there is not so much to loose.
Well, this will give me the time needed to try it out and finalize it before the final integration.
Ah good ...
BTW, I've just merged the pharo3.0_dev branch into the pharo3.0 branch as part of the release process ... when I'm done I'll update the pharo3.0_dev branch to match the head of the pharo3.0 branch so you're pull request will still be good...
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Monday, July 22, 2013 10:14:20 PM | Subject: Re: [filetree] Metadata less (#93)
| In my opinion: | changing a bit the Monticello tools to disregard timestamp only | differences is correcting a bug in my opinion (now I understand why | I see unnecessary changes at times). | And what I have done so far is good at recreating a Mcz-compatible | history from the git repository, so there is not so much to loose. | Well, this will give me the time needed to try it out and finalize it | before the final integration. | — | Reply to this email directly or view it on GitHub .
Ok, what is needed is a single change in MCMethodDefinition>>= for the metadata-less to work in the current tools; it is the only user of timestamp in the equality area.
I will submit that change into pharo (or in Monticello directly?).
Welll I don't know if the pharo folks will go for this change ... in Monticello it is significant because the author/timestamp can be changed independently of the source ... but it's worth a try:)
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Monday, July 22, 2013 10:32:50 PM | Subject: Re: [filetree] Metadata less (#93)
| Ok, what is needed is a single change in MCMethodDefinition>>= for | the metadata-less to work in the current tools; it is the only user | of timestamp in the equality area. | I will submit that change into pharo (or in Monticello directly?). | — | Reply to this email directly or view it on GitHub .
But why consider it significant if none others (MethodDefinition, all the rest) are using their timestamp? We'll see; I have created an issue.
I intend to do the merge soon .. I'm assuming the conflicts are there because of monticello version metadata conflicts ... if not I might have to have you resolve the conflicts ...
I notice that some of the packages in this pull request have been written with new monticello meta data? or is that an artefact of testing ... I guess I have a concern that for the FileTree packages I need to preserve the Monticello meta data and as long as we can continue to preserve the monticello meta data for FileTree packages I'm happy ...
Hum. At the moment, it recreates the history when loading the package, and when FileTree saves it, then the new metadata is written (i.e. the version becomes the git-extracted version history).
Ok. It seems by remerging (and solving one monticello.meta/version conflict again!) it goes to an utomatic merging mode and we may even have the travis CI status with it.
I have also changed the uuid generation to have the same uuid generated everytime the info is build for a package version (the uuid is a SHA1 hash of the git commit ID and the package name).
Now, what about an option of desactivating the generation of monticello.meta/version and methodProperties.json/ston in FileTree?
Looks like I'll need some help to debug the travis CI results...
@ThierryGoubier ... scan to the travis log and if the tests are run (no stack traces) there will be a tests results banner at the beginning and end of the file that looks like this;
**************************************************************************************
Results for #('BaselineOfFileTree') Test Suite
60 run, 55 passes, 0 skipped, 0 expected failures, 5 failures, 0 errors, 0 unexpected passes
**************************************************************************************
*** FAILURES *******************
MCGitFileTreeLoaderTest debug: #testTargetLoad.
MCGitFileTreeIssue33Test debug: #testWriteNRead.
MCGitFileTreeIssue33Test debug: #testLoad.
MCGitFileTreeLoaderTest debug: #testBaseLoad.
MCGitFileTreeIssue33Test debug: #testWrite.
**************************************************************************************
If there is no test result banner then an error occurred that prevented the running of the tests ... there are wide variety of possible errors: pharo site failing to deliver a proper download, github failing to provide a checkout, walkbacks loading code or executing scripts.
Each script should dump a little line like the following:
travis---->before.st
travis---->bootstrapMetacello.st
at the very beginning of the script so that you can tell which script is responsible for the output ...
Does this help?
There is a banner, with the failed tests so I suppose everything loaded fine (otherwise I would see more errors), but, it doesn't help that all those tests are green on my workstation.
Yeah, these are the tougher problems.
Then the next question is have you loaded the packages from scratch and still gotten green tests? If not, then there are some missing initializers, etc.
Also keep in mind that for a pull request, the test is run after the merge, so the test failures may indicate integration failures ... the way to test this is to merge the remote repository into your own repository, load from scratch and run tests ...
From the network graph I see that you've merged with Pharo2.0, but have you loaded into a fresh pharo2.0 download? ... I do not know how often those things change, but if they do, the builderCI job always gets the latest Pharo2.0...
Okay, I looked at your diffs and I see that you changed the method signature in MCGitFileTreeIssue33Test>>testLoad and there are three possible failure sites ... two involve hasPackage and one involves the method time stamps ... so if fresh reloads do not reproduce the results we have to resort to debugging the test remotely using Transcript show ... adding a Transcript show that prints the expected and actual time stamps should give a clue as to why the test is failing ... perhaps the behavior is peculiar to a particular version of git? ...
Hopefully the problem will show up as a missing initializer...
Dale
----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Wednesday, July 24, 2013 11:08:42 AM | Subject: Re: [filetree] Metadata less (#93)
| There is a banner, with the failed tests so I suppose everything | loaded fine (otherwise I would see more errors), but, it doesn't | help that all those tests are green on my workstation. | — | Reply to this email directly or view it on GitHub .
I often load a fresh pharo image, but I'm effectively not entirely sure I follow the same load pattern. If you say that a Transcript show: will appear, then I will try that, test by test to get an idea.
Couldn't be more direct. All timestamp tests fail in TestIssue33, and even the commit ID test I added (testRepository) fails.
I'll add more tracing on that.
Yeah, we need the timestamp from the definition itself, since it appears to failing the first assertion for timestamp tests ... Ugly, but necessary when it only fails on travis:(
Well, no chances it can work. I don't have the same issue33 repo as on Travis.
On travis, the git sha1 is 4058b81ec118a0acc9ca55714e53385278e2a908, and I have only one version in the history (Author: DaleHenrichs Time: 28 June 2012, 3:07:37 pm)
On my branch, as merged from the pharo 2.0 branch, I have: two versions in the history, the git sha1 are b36f0a2532aaa4c995108ff6cb1363a4297bbf59 and f8daa1be31418b4432cb52413261d2bdeb4bd8f2, DaleHenrich 16 June 2012 for both.
I think that if I create an issue33.2 repo and update the timestamps, then it should work through the merge. Now, let's look at the next test case.
Or does that point to a problem in the way the test repos are copied to travis ?
Same here.
I can add code in the tests to force an alternative checkup on the commitId available in Travis, but it doesn't make sense, isn't it?
Okay ... when travis does a run it does a merge of your branch and the pharo3.0 branch and does a commit .... I think the lesson here is that depending upon a SHA is not quite the same as a UUID ... if you were to checkout a specific commit of the test directory, you might have more luck, but presumably a change to a readme will cause the directory's SHA to be changed?
Dale ----- Original Message -----
| From: "Thierry Goubier" notifications@github.com | To: "dalehenrich/filetree" filetree@noreply.github.com | Cc: "Dale Henrichs" dale.henrichs@gemtalksystems.com | Sent: Wednesday, July 24, 2013 1:55:58 PM | Subject: Re: [filetree] Metadata less (#93)
| Same here.
| * ver01 should be 65e9904 , on travis it is 4058b81 . | * ver02 should be 8d225c9 , on travis it is 4058b81 .
| I can add code in the tests to force an alternative checkup on the | commitId available in Travis, but it doesn't make sense, isn't it? | — | Reply to this email directly or view it on GitHub .
Ok, if it merges against pharo3.0 instead of 2.0 I may be in trouble.
CommitID is specific to the package, not to anything in the surroundings. If the Readme is changed, its counted as a new version of the package.
I've added the nonsense git sha1 from Travis as special timestamps and commits to try to get the tests to go green. I'm done for tonight.
Perhaps a different testing strategy is called for? Presumably you don't really care about the UUID of the packages and basing the UUID on the SHA of the commit should be enough. The UUIDs should be equal if a package is loaded from two different repositories checking out the same commit ... so that is what is important ... so perhaps we shouldn't care about the UUID so much (other than to validate that two different checkouts of the same commit give the same UUID ... for the method time stamps the (unfortunate) right answer is to use the timestamp of the last commit of that package .... then at least you will have stable timestamps ... which I think are important and more correct ...
Fresh start in morning:)
Dale, I don't test against the package UUID, just against the git data (commitID, author and date). All that you describe as should be done is the way gitfiletree is doing it (package uuid is sha1 hash of git commit id + package name, timestamp is git commit timestamp, author is git commit author).
I strongly believe travis makes a mess of the merge with the test data, and that I am unable to get anything reasonable out of my git queries.
I know the culprit: Travis L4, git clone -depth=50. This cuts short the git history and removes the commits I was checking for: f8daa1be31418b4432cb52413261d2bdeb4bd8f2, 65e9904e0ce9608d9e5867bb7e4dc7f37b411e13, 8d225c9b0158aed04882b0836958672ba8204bf4.
I leave it to you to change the travis CI script and revert this request to 523c7890d33047cbf3933ec3eb1e687d2ae6b7b1, or keep as it is now since it has additional tests to detect a future git copy mistake, all tracing code has been removed in 62c14ae953d126d896c71dbcc51176fb5bd2df79 (but I would suggest changing the ci script, otherwise those tests will start failing in a year or so...).
Added builderCI issue for the clone depth
bit ... I think parameterizing the depth makes sense since very few tests need to copy the whole repo
Thanks! It's a relief to have been able to solve it.
Yeah, that was a great bit of sleuthing!
Dale,
here is the metadata-less version of gitfiletree.
To summarize the features:
It does change the way the repository contents look like (particularly history, version numbers and author), so if you can validate the way it looks (or if others may have a look as well), I'd be happy to have your return.