oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

revision handling in CVS history log parser should be more robust #824

Open conuil opened 10 years ago

conuil commented 10 years ago

Hi,

I have a file that was checked in, but when I go to its history, the Revision column shows the word "check" instead of the version number. If I click on "check", it tells me it's a binary file. If I download the file, it's blank. The file is a code file for FAME, so it is pretty much plain text. We have plenty of other similar files that work fine.

I've checked the CVS ,v files, the Entries file and Opengrok's versions of these files, all seem fine. If I click on Annotate I can see the most recent changes with the correct version number e.g. 1.59.

I also ran a cvs status report and the file shows up as Up-to-date along with all the others.

Any ideas? Is it possible to debug OpenGrok? Is there somewhere else OpenGrok stores information on files/versions?

I'm working on Unix, with OpenGrok version 11. Thanks

Emmet

vladak commented 10 years ago

Please go to 0.12.1 first and retry.

vladak commented 10 years ago

Also, the indexer log should contain the cvs log command that was used to fetch the history of the file. You can try rerunning that command to see what it produces.

vladak commented 10 years ago

The actual command for fetching the history of a given file can be seen in getHistoryLogExecutor() (https://github.com/OpenGrok/OpenGrok/blob/master/src/org/opensolaris/opengrok/history/CVSRepository.java). For an ordinary repository (not a branch) the result will be /usr/bin/cvs log -b <filename>

vladak commented 10 years ago

I tried to set up a test repo like this:

  cvs -d ~/cvsroot init
  export CVSROOT=~/cvsroot
  mkdir foo
  cd foo
  date > foo.txt
  cvs import -m "dir structure" cvsexample yourname start
  cd
  cvs checkout cvsexample
  cd cvsexample
  cvs edit foo.txt
  ...
  cvs log -b foo.txt
  cvs commit
  cvs log -b foo.txt

and the results of the cvs log commands are as expected. It would be nice if you could provide some steps for reproducing the issue.

vladak commented 10 years ago

As for debugging, the instructions are on https://github.com/OpenGrok/OpenGrok/wiki/Developer-intro

conuil commented 10 years ago

Many thanks for all the suggestions. I can't really upgrade to version 12, as this is a production environment, unless you think version 11 is particularly troublesome. The log command output seems fine, although the very first comment for the first version is "revision check". I wonder whether this could be confusing the parser. I'm trying to commit a new file with a similar comment to see if that also causes the error.

conuil commented 10 years ago

That's it exactly. If you commit a file for the first time with a comment of "revision check", then "check" will always show up instead of the version number for the latest revision. See images below.

[Screenshots not preserved: image, opengrokb]

Maybe there is a way to change the original comment.

Thanks

Emmet

vladak commented 10 years ago

The processStream() method in https://github.com/OpenGrok/OpenGrok/blob/master/src/org/opensolaris/opengrok/history/CVSHistoryParser.java#L96 indeed picks up the revision string:

            if (state == ParseState.REVISION && s.startsWith("revision")) {
                                                ^^^^^^^^^^^^^^^^^^^^^^
                if (entry != null) {
                    entries.add(entry);
                }
                entry = new HistoryEntry();
                entry.setActive(true);
                String commit = s.substring("revision".length()).trim();
                entry.setRevision(commit);
                if (tags.containsKey(commit)) {
                    entry.setTags(tags.get(commit));
                }
                state = ParseState.METADATA;
                s = in.readLine();
            }

Maybe the code could first check the format of the substring and bail out if it does not match something like [0-9\.]*.
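A minimal sketch of such a format check, assuming CVS revision numbers consist of two or more dot-separated decimal components (e.g. 1.59 or 1.2.2.1). The class and method names are hypothetical and not part of OpenGrok. The pattern is deliberately stricter than [0-9\.]*, which would also accept an empty string:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of OpenGrok.
final class CvsRevisionLine {
    // Assumption: "revision", whitespace, then a dotted number such as 1.59.
    // Anything after the number must be separated from it by whitespace.
    private static final Pattern REVISION_LINE =
            Pattern.compile("revision\\s+(\\d+(?:\\.\\d+)+)(?:\\s.*)?");

    private CvsRevisionLine() {
    }

    // Returns the revision number if the line looks like a revision header,
    // or an empty Optional for lines such as "revision check".
    static Optional<String> parseRevision(String line) {
        Matcher m = REVISION_LINE.matcher(line);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

With this check, "revision check" and "revision 3" are rejected while "revision 1.4" is accepted. A commit message that happens to read "revision 1.2" would still pass, so the check only helps in combination with correct state tracking in the parser.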

vladak commented 10 years ago

Of course, it's a question why the parser thinks it is in the ParseState.REVISION state.

Could you post the full output of cvs log -b testOpengrok.txt?

conuil commented 10 years ago

Sure thing. Incidentally I can't seem to change the "revision check" text. I can change the comment for version 1.1 (see below) but "revision check" still shows up in the log.

Working file: testOpengrok.txt
head: 1.4
branch:
locks: strict
access list:
symbolic names:
keyword substitution: kv
total revisions: 4; selected revisions: 4
description:
revision check
----------------------------
revision 1.4
date: 2014-04-30 14:21:51 +0200;  author: ryanemm;  state: Exp;  lines: +1 -0;  commitid: gU9mXLVzDcosHHyx;
initial version
----------------------------
revision 1.3
date: 2014-04-30 12:31:22 +0200;  author: ryanemm;  state: Exp;  lines: +1 -0;  commitid: oRA8CPsmKh8y5Hyx;
revision 3
----------------------------
revision 1.2
date: 2014-04-30 12:27:50 +0200;  author: ryanemm;  state: Exp;  lines: +1 -0;  commitid: rubxXUsfBo8l4Hyx;
second version
----------------------------
revision 1.1
date: 2014-04-30 12:26:30 +0200;  author: ryanemm;  state: Exp;  commitid: q7hGswuwo2HS3Hyx;
first version
=============================================================================

conuil commented 10 years ago

Sorry, I don't know why it's bolding certain text.

vladak commented 10 years ago

No worries, fixed that.

vladak commented 10 years ago

I see now. Could you share the steps for creating such a repository?

vladak commented 10 years ago

Aha, it's the add command. If I do:

cvs add -m "revision XXX" fff
cvs commit -m "foo"

then the log will look like this:

RCS file: /export/home/vkotal/cvsroot/cvsexample/fff,v
Working file: fff
head: 1.1
branch:
locks: strict
access list:
symbolic names:
keyword substitution: kv
total revisions: 1; selected revisions: 1
description:
revision XXX
----------------------------
revision 1.1
date: 2014-04-30 14:40:59 +0200;  author: vkotal;  state: Exp;  commitid: IEMyhqXa1440OHyx;
foo
=============================================================================

vladak commented 10 years ago

I can see the problem now in the UI. This means we have a fully reproducible test case, and now we need to find someone to fix this :-)
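One possible direction for a fix, sketched under assumptions rather than taken from OpenGrok's CVSHistoryParser: in the logs above, every real revision header directly follows a line of dashes, while the description text ("revision XXX") and commit messages ("revision 3") do not. The class name, method name, and the "20 or more dashes" separator rule below are hypothetical choices for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not OpenGrok's actual parser.
final class CvsLogRevisions {
    // Assumption: an entry separator is a line made only of dashes.
    private static final Pattern SEPARATOR = Pattern.compile("-{20,}");
    // Assumption: a revision header is "revision" plus a dotted number.
    private static final Pattern REVISION =
            Pattern.compile("revision\\s+(\\d+(?:\\.\\d+)+)(?:\\s.*)?");

    private CvsLogRevisions() {
    }

    // Collects revision numbers, accepting a "revision" line only when the
    // line immediately before it was a separator.
    static List<String> revisions(List<String> logLines) {
        List<String> result = new ArrayList<>();
        boolean afterSeparator = false;
        for (String line : logLines) {
            if (afterSeparator) {
                Matcher m = REVISION.matcher(line);
                if (m.matches()) {
                    result.add(m.group(1));
                }
            }
            afterSeparator = SEPARATOR.matcher(line).matches();
        }
        return result;
    }
}
```

Fed the log for fff shown above, this yields only 1.1, because the "revision XXX" description line follows "description:" rather than a separator. A commit message that itself contains a dash-only line followed by a revision-like line would still be misread, so this is a mitigation and not a complete parser.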

conuil commented 10 years ago

If anyone wants a quick workaround, you can change the description text with cvs admin:

cvs admin -t-"initial text" testOpengrok.txt