jimregan / foma

Automatically exported from code.google.com/p/foma

Memory hog in apply.c: string duplicated with xxstrdup is never freed when used from flookup. #45

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Use hunfnnum.fst, the result of building Hungarian foma. Hungarian foma can be downloaded from
https://gitorious.org/hunmorph-foma/hunmorph-foma/trees/master
2. Run do_testup.sh (attached) and watch the size of flookup with ps; it is 39.6 MB.
3. perl x.pl <x >x1 (both attached)
4. perl /home/en/program/foma/tktest/szokincsteszt/szeged/chkwdlistup.pl x1 > x2
   This takes about 5-10 minutes; after the test, the size of flookup has grown to more than 54 MB.
   (chkwdlistup.pl attached)

What is the expected output? What do you see instead?
The problem is that the following code in apply.c
----------------------------------------------------------------
int apply_check_flag(struct apply_handle *h, int type, char *name, char *value)
{
    struct flag_list *flist, *flist2;
    for (flist = h->flag_list; flist != NULL; flist = flist->next) {
        if (strcmp(flist->name, name) == 0) {
            break;
        }
    }
    h->oldflagvalue = flist->value;
    h->oldflagneg = flist->neg;

    if (type == FLAG_UNIFY) {
        if (flist->value == NULL) {
            flist->value = xxstrdup(value);  /* this causes the hog */
            return SUCCEED;
        }
--------------------------------------------------
duplicates a string and never frees it. I found a solution that fixes
the problem:
In flookup.c:
In declarations:
extern void apply_clean();
extern void apply_clean_start();
.....
void handle_line(char *s) {
    char *result, *tempstr;
    apply_clean_start();

....
    }
    apply_clean();
}

In apply.c:
In declarations:
static int apply_clean_variable;
#define MAX_SAVED 10
static struct flag_list *saved_flag_list[MAX_SAVED];
static char *saved_values[MAX_SAVED];
static int clean_ix;
static void apply_add_clean_list(struct flag_list *flist, char *value);
void apply_clean();
void apply_clean_start();
....
int apply_check_flag(struct apply_handle *h, int type, char *name, char *value)
{
    struct flag_list *flist, *flist2;
    for (flist = h->flag_list; flist != NULL; flist = flist->next) {
        if (strcmp(flist->name, name) == 0) {
            break;
        }
    }
    h->oldflagvalue = flist->value;
    h->oldflagneg = flist->neg;

    if (type == FLAG_UNIFY) {
        if (flist->value == NULL) {
            flist->value = xxstrdup(value);
            apply_add_clean_list(flist, flist->value);
            return SUCCEED;
        }
....

void apply_clean_start()
{
    /* Start recording duplicated flag values. */
    apply_clean_variable = 1;
}

void apply_add_clean_list(struct flag_list *flist, char *value)
{
    if (apply_clean_variable) {
        saved_flag_list[clean_ix] = flist;
        saved_values[clean_ix] = value;
        /* The index wraps around to 0 after MAX_SAVED entries. */
        if (++clean_ix >= MAX_SAVED) {
            clean_ix = 0;
        }
    }
}

void apply_clean()
{
    if (apply_clean_variable) {
        int i;
        /* Free each recorded string and mark its flag as unset. */
        for (i = 0; i < clean_ix; i++) {
            xxfree(saved_values[i]);
            saved_flag_list[i]->value = NULL;
            saved_values[i] = NULL;
            saved_flag_list[i] = NULL;
        }
        clean_ix = 0;
        apply_clean_variable = 0;
    }
}
void apply_clean_start();
void apply_add_clean_list(struct flag_list *flist, char *value);
void apply_clean();

What version of the product are you using? On what operating system?
Newest from svn, on Debian Linux

Please provide any additional information below.
Description of the solution:
1. Signal that a lookup is starting in flookup:
set apply_clean_variable = 1;

2. While apply_clean_variable == 1, remember every string duplicated with xxstrdup in a list (at most 10 of them).

3. When the lookup in flookup finishes, free all strings in the list, set the corresponding value pointers
in struct flag_list *flist to NULL, and set apply_clean_variable = 0;
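
The three steps above can be sketched in isolation as follows. This is only an illustration, not foma code. struct flag_entry, struct clean_list, clean_start, clean_add, clean_finish and demo_lookup are hypothetical names, plain malloc/free stand in for xxstrdup/xxfree, and "Case"/"Nom" are made-up flag data. The one deliberate difference from the patch is that the list grows on demand. In the patch the index wraps back to 0 once MAX_SAVED entries are stored, so a lookup that duplicates more than 10 strings would lose track of the earlier ones.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for foma's struct flag_list; only the fields used here. */
struct flag_entry {
    const char *name;
    char *value;                 /* NULL means the flag is unset */
};

/* One remembered duplication: which entry received which string. */
struct saved_dup {
    struct flag_entry *entry;
    char *value;
};

/* Hypothetical tracker playing the role of the saved_* arrays. */
struct clean_list {
    struct saved_dup *items;
    size_t count, capacity;
    int active;                  /* role of apply_clean_variable */
};

/* Step 1: start recording. */
static void clean_start(struct clean_list *cl) { cl->active = 1; }

/* Step 2: remember a duplicated string; grows instead of wrapping.
   Returns 0 on success, -1 if the list could not grow. */
static int clean_add(struct clean_list *cl, struct flag_entry *e, char *value) {
    assert(cl != NULL && e != NULL);
    if (!cl->active) return 0;
    if (cl->count == cl->capacity) {
        size_t ncap = cl->capacity ? cl->capacity * 2 : 8;
        struct saved_dup *grown = realloc(cl->items, ncap * sizeof *grown);
        if (grown == NULL) return -1;
        cl->items = grown;
        cl->capacity = ncap;
    }
    cl->items[cl->count].entry = e;
    cl->items[cl->count].value = value;
    cl->count++;
    return 0;
}

/* Step 3: free every remembered string; returns how many were freed. */
static size_t clean_finish(struct clean_list *cl) {
    size_t freed = cl->count;
    for (size_t i = 0; i < cl->count; i++) {
        /* Clear the flag only if it still points at the string being freed. */
        if (cl->items[i].entry->value == cl->items[i].value)
            cl->items[i].entry->value = NULL;
        free(cl->items[i].value);
    }
    free(cl->items);
    cl->items = NULL;
    cl->count = cl->capacity = 0;
    cl->active = 0;
    return freed;
}

/* Simulate one lookup that sets a flag and backtracks n times. */
static size_t demo_lookup(size_t n) {
    struct clean_list cl = {0};
    struct flag_entry flag = { "Case", NULL };
    clean_start(&cl);
    for (size_t i = 0; i < n; i++) {
        char *copy = malloc(4);
        if (copy == NULL) break;
        memcpy(copy, "Nom", 4);
        flag.value = copy;
        if (clean_add(&cl, &flag, copy) != 0) {
            free(copy);
            flag.value = NULL;
            break;
        }
        flag.value = NULL;       /* backtracking drops the only pointer */
    }
    return clean_finish(&cl);
}
```

Calling demo_lookup(25) records and frees 25 strings in one lookup, which is more than the fixed-size list in the patch can hold.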

----------------------------------

I also tried eliminating the strdup in apply_check_flag and returning FAIL
instead. However, in that case many words were no longer found, i.e. foma
no longer works correctly.
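
One plausible reading of why that experiment loses words can be sketched as below. It assumes the usual unification semantics of flag diacritics: an unset flag takes the value and succeeds, and a set flag succeeds only on an equal value. The code is a simplified stand-in that ignores negated values and is not foma's implementation. struct flag, unify, unify_no_copy and path_accepted are hypothetical names, as are the example values.

```c
#include <stdlib.h>
#include <string.h>

enum { UNIFY_FAIL = 0, UNIFY_OK = 1 };

/* Stand-in for one flag; value == NULL means unset. */
struct flag { char *value; };

/* Simplified unify: an unset flag stores a copy of the value. */
static int unify(struct flag *f, const char *value) {
    if (f->value == NULL) {
        size_t n = strlen(value) + 1;
        f->value = malloc(n);
        if (f->value == NULL) return UNIFY_FAIL;
        memcpy(f->value, value, n);
        return UNIFY_OK;
    }
    return strcmp(f->value, value) == 0 ? UNIFY_OK : UNIFY_FAIL;
}

/* Variant matching the experiment: no copy, FAIL on an unset flag. */
static int unify_no_copy(struct flag *f, const char *value) {
    if (f->value == NULL) return UNIFY_FAIL;
    return strcmp(f->value, value) == 0 ? UNIFY_OK : UNIFY_FAIL;
}

/* A path is accepted only if every unification along it succeeds. */
static int path_accepted(int (*step)(struct flag *, const char *),
                         const char *const *values, size_t n) {
    struct flag f = { NULL };
    int ok = 1;
    for (size_t i = 0; i < n && ok; i++)
        ok = (step(&f, values[i]) == UNIFY_OK);
    free(f.value);
    return ok;
}
```

With unify, a path that unifies the same value twice is accepted. With unify_no_copy, the very first unification on any path fails, so every word whose analysis passes through a unify flag is rejected.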

Original issue reported on code.google.com by eleonor...@gmx.net on 16 Jan 2013 at 2:26
