rcls / crap

Cvs Remote Access Program
GNU General Public License v3.0
46 stars 12 forks source link

Don't just accept git-forbidden tags #26

Open eyalroz opened 4 years ago

eyalroz commented 4 years ago

Some tags which are acceptable in CVS, are unacceptable in git. See this StackOverflow answer for a list of disallowed character sequences in tag names.

crap doesn't check tag names against these restrictions, and just goes ahead and accepts them into the tag database in read_file_versions(). Now, when it later encounters such a tag in going over the commit / tagging history, something fails/crashes (perhaps cvs-fast-export? git-fast-import?) and the process is terminated.

I'm not exactly sure what the solution should be, since I'm having trouble following to code to realize exactly where the tag name is used later on.

A (poor) workaround which I've somehow managed to hack together is simply not inserting problematic tag into the tag DB within read_file_versions(); but it's not the right thing to do. What is the right thing to do?

Option 1:

Option 2:

Option 3:

Support both options 1 and 2, and let a command-line switch choose between them. Default to escaping rather than deleting, I would say.

eyalroz commented 4 years ago

I'm not offering a PR this time (probably), but here is a bit of code:

bool tag_name_is_valid(const char* restrict str, size_t len)
{
    // Rules as appear here:
    // https://stackoverflow.com/a/26382358/1593077

    // rule 1
    if (str[0] == '.') { return false; }
    if (strstr(str, "/.")) { return false; }
    if (strstr(str, ".lock/")) { return false; }
    if (strcmp(str + len - 5, ".lock") == 0) { return false; }
    // rule 2 - unused, since theoretically we could have "allow-onelevel"
    // if (strchr(str, '/') == NULL) { return false; }
    // rule 3
    if (strstr(str, "..")) { return false; }
    // rule 4
    for(size_t i; i < len; i++) {
        char c = str[i];
        bool is_ascii_control_char = (c < 0x20) || (c == 0x7F /* DEL */);
        if (is_ascii_control_char || c == ' ' || c == '~' || c == '^' || c ==':' ) return false;
    }
    // rule 4
    if ( (strpbrk(str, "?*[") != NULL) ) { return false; }
    // rule 5
    if ( (strpbrk(str, "?*[") != NULL) ) { return false; }
    // rule 6
    if ( (str[0]) || (str[len - 1] == '/') || (strstr(str, "//") != NULL) ) { return false; }
    // rule 7
    if (str[len - 1] == '.') { return false; }
    // rule 8
    if (strchr(str,"@{") != NULL) { return false; }
    // rule 9
    if (str[0] == '@' && len == 1) { return false; }
    // rule 10
    if (strchr(str, '\\') != NULL) { return false; }
    return true;
}