jazzband / django-formtools

A set of high-level abstractions for Django forms
https://django-formtools.readthedocs.io
BSD 3-Clause "New" or "Revised" License
794 stars 135 forks source link

Infinite Recursion possible in 2.4 #220

Closed dave-v closed 1 year ago

dave-v commented 1 year ago

We have a condition_dict containing the following function:

def need_further_steps(wizard):
    data = wizard.get_cleaned_data_for_step("0") or {}
    if data.get("no_invoices_needed", False):
        return False
    return True

The call to WizardView.get_cleaned_data_for_step then calls WizardView.get_form, which used to reference self.form_list, but the following commit changes this to self.get_form_list(), which then calls the user-supplied condition_dict functions, which then calls our need_further_steps function above, which calls WizardView.get_cleaned_data_for_step, and so on ad infinitum. https://github.com/jazzband/django-formtools/commit/533a83053deab2adeb72de079e813aae78b03c1a

Perhaps our custom condition function is bad, but it doesn't seem too unreasonable?

claudep commented 1 year ago

@stuaxo, @schinckel, any idea about how to avoid that regression?

schinckel commented 1 year ago

I think I’ve had similar issues. I’ll try to have a look later tonight to see how I got around it.

schinckel commented 1 year ago

Okay, I've found the similar problem I came across. I think it's the "same" problem, just a different approach.

I had code in get_form_list that needed to see cleaned data from other steps, which triggers the infinite co-recursion.

My workaround was to get the raw data instead, and build up that form and get the cleaned data that way.

We should think about either/both a way to handle this (getting cleaned data from other steps in condition_dict callable), and/or documenting the problem.

schinckel commented 1 year ago

Ah, looking at the code, it seems that whilst get_cleaned_data_for_step(step) doesn't use get_form_list(), it uses get_form, which does.

As an aside, I think for consistency, get_cleaned_data_for_step should use get_form_list.

But that doesn't solve this problem (it just moves it to a different place).

schinckel commented 1 year ago

I wonder if it's possible to use a generator that retains knowledge of previous steps: it should be possible for condition_dict to refer to previous steps, but not subsequent steps.

Let me see what I can get to work.

stuaxo commented 1 year ago

Maybe get_cleaned_data_step could be behind lru_cache with maxsize None.

violuke commented 1 year ago

We're also having the same issue

DJJacobs commented 1 year ago

Experiencing the same issue upon update from 2.3 > 2.4. Reverting back to 2.3 for now. Following for updates...

violuke commented 1 year ago

The fix of #168 has caused this problem. My understanding is that you should be using the conditiion_dict to vary which steps are in the wizard and which are not. I can see why overriding get_form_list() would seem like a nice alternative (and maybe simpler to use), but by changing get_form() to use get_form_list() and solve the request of #168, this makes it impossible to have conditions which look at prior form steps as part of their conditions and therefore breaks what is probably one of the most common use-cases of this package.

My opinion is that 533a83053deab2adeb72de079e813aae78b03c1a should be reverted as it tries to allow the same problem to be solved in a new, previously-not-possible, way and breaks the original way of solving the same problem that's been part of this great package for years.

Thanks for your help, we've also had to pin 2.3 due to this being a breaking change for our use of the package.

spapas commented 1 year ago

I have a different opinion on this. By being able to override the get_form_list you have complete freedom on the list of forms you're going to display in your wizard. See here some example code from a project where a wizard is created dynamically based on some configuration (taken from the PR_ACTION list of dics):

class PilotRequestActionView(..., SessionWizardView):
    ...

    def get_form_list(self):
        form_list = OrderedDict()
        action = self.kwargs["action"]
        has_form = [x for x in PR_ACTIONS if x["code"] == action][0]["has_form"]
        if has_form:
            form_class_str = (
                "PilotRequest"
                + "".join(x.capitalize() for x in action.split("_"))
                + "Form"
            )
            form_class = getattr(forms, form_class_str)
            form_list[action] = form_class
        form_list["submit_req"] = forms.PilotRequestSubmitForm
        return form_list

So I can do something like

path(
        "act/<str:action>/<int:pk>/",
        views.ActionView.as_view(),
        name="req-action",
    ),

in my urls and create different wizards.

I really don't think that something like that would possible using the condition_dict.

schinckel commented 1 year ago

Yeah, the original use case for my patch (and what I've been using as a subclass for many years) was where I needed to add steps dynamically, based upon the submitted data from an earlier step. Specifically, there was a form that needed to be re-used N times, with different initial data.

schinckel commented 1 year ago

So the fix I have (linked in this thread), it "works", but there's not a real way to detect that we have attempted to reference data from a subsequent step (and are inside an infinite recursion), until the Python infinite recursion protection kicks in by running out of stack: at that point we can't recover because, well, we are out of stack.

If there was a neat way of detecting we are in a recursive call, then we could try that - but without sniffing the stack, I wasn't able to detect that.

stuaxo commented 1 year ago

It probably makes sense to step back and list the places ways the data is referenced at different steps.

tilsche commented 1 year ago

We also have this issue that blocks us from upgrading.

stuaxo commented 1 year ago

@schinckel if you want to check if we're in a recursive call you can have a counter starting at 0. One entry to the method increment it, and on exit decrement it.

If, on the way in the counter is not 0 then you are in a recursive call.

jsma commented 1 year ago

Please revert 533a83053, which introduced a major bug and was not a proper solution for the new feature request (to dynamically add forms that aren't in the initial form_list). For those that require this new feature, please provide a working example of your use cases and how you were doing this prior to 2.4. A proper API needs to be created and tested for this new feature request, without breaking existing documented functionality. That has not been done yet. A proper API shouldn't require any hacks around infinite recursion errors, it should be designed so these errors are impossible.

violuke commented 1 year ago

@claudep, @felixxm and @hramezani (I believe you're the Jazzband leads on this project) would it be possible for someone to review this thread? There was a major breaking change in 2.4 and a solution needs to be found if possible ASAP. Thank you.

schinckel commented 1 year ago

Maybe we should revert the change, and work to find a solution that works later. I just don’t have the time to spend on it now.

martey commented 1 year ago

I'm confused about why people are advocating for this change to be reverted "ASAP". Is there a reason why affected people can't stick with 2.3 until a comprehensive solution for this issue is found?

jsma commented 1 year ago

@martey, it should be reverted because 2.4 is a broken release that breaks real projects and costs people valuable time and money. It's not very reasonable to expect everyone to find out the hard way that 2.4 is broken, find this issue, and then downgrade to 2.3, when reverting could avoid all of that. To be clear, we don't need to come up with a comprehensive solution for the infinite recursion bug. The changeset that introduced the bug simply needs to be reverted. Once that is done, a comprehensive solution to the feature request (to be able to dynamically add forms that are not present in form_list) would need to be developed, documented, and tested before being included in a future release.

@schinckel, are you saying you don't have time for reverting the commit or just that you do not have time to work on a proper solution for the feature request? There is already #222 for reverting the change. In my opinion, 2.4 should be yanked from PyPI in favor of the next release. I could submit a PR to update the changelog. Let us know how we can help. Thanks!

schinckel commented 1 year ago

The latter - to come up with a proper solution.

martey commented 1 year ago

@jsma I don't think it's fair to assume that everyone uses django-formtools in the same way, or that everyone is affected by this regression. I think people using condition_dict callables that call get_cleaned_data_for_step (who are affected by this bug) and people using dynamic forms with get_form_list both have valid use cases.

Once that is done, a comprehensive solution to the feature request (to be able to dynamically add forms that are not present in form_list) would need to be developed, documented, and tested before being included in a future release.

This already happened (see #62, #168, #209). The "future release" was 2.4.

Calling this a "new feature" or "feature request" is inaccurate, especially since people have been confused by the fact that form_list is not generated from get_form_list for at least a decade (e.g. this Stack Overflow question from 2012), probably because the method's docstring has always suggested that this is the case.

In my opinion, 2.4 should be yanked from PyPI in favor of the next release.

Removing 2.4 from PyPI and releasing a new version would prevent people using dynamic forms with get_form_list from pinning 2.4. I think this would make things worse, not better.

spapas commented 1 year ago

Hey friends! I'd like to propose a solution to this if I may:

For starters, let's agree that we can't use get_form_list if we also use get_cleaned_data_for_step somewhere. I think that makes sense, if we use a dynamic form we may add the same step over and over so get_cleaned_data_for_step wouldn't even know what to return!

Now, we'll add a FORM_WIZARD_DYNAMIC_FORM_LIST django setting (we can either set it to True or False by default depending on which behavior we want to keep). Then we'll change the get_form line that selects the form class (here https://github.com/jazzband/django-formtools/blob/master/formtools/wizard/views.py#LL409C9-L409C48) to something like:

if settings.FORM_WIZARD_DYNAMIC_FORM_LIST:
  form_class = self.get_form_list()[step]
else:
  form_class = self.form_list[step]

Finally, we'll add the following snippet to the start of get_cleaned_data_for_step:

if settings.FORM_WIZARD_DYNAMIC_FORM_LIST:
  raise Exception("You can't use get_cleaned_data_for_step  when you use a dynamic form list")

I think the above should resolve the problem and be a good solution for both people that need to use a dynamic form list and for those that need to retrieve previous steps data.

An even better solution would be to add the FORM_WIZARD_DYNAMIC_FORM_LIST as an attribute to the WizardView class so each form wizard would be either dynamic or not.

schinckel commented 1 year ago

That's fine, but what if you have a reusable app that does/doesn't, and you use that in your project that has the other setting?

martey commented 1 year ago

I agree that a project-wide Django setting is too broad, but this could be fixed by adding a dynamic_form_list variable to WizardView that defaults to False.

stuaxo commented 1 year ago

This might be an OK compromise - it's hacky, but may be the only sensible way round this.

Is it possible for someone to post the smallest conditional form wizard here, so we can try out solutions ?

fdemmer commented 1 year ago

I propose to not use example projects or code snippets, but actual test code going forward.

To help with that I just opened https://github.com/jazzband/django-formtools/pull/230 which includes a test for using get_cleaned_data_for_step in a condition_dict callback.

stuaxo commented 1 year ago

I'll have a stare at this later + try and understand it (and invite everyone else to) - I (or someone else), should add a test to this for modifying get_form_list() as well so we have both cases.

EDIT: Thanks for making an actual test; I'm pressed for free time, so it's really helpful to have a working example to start from.

stuaxo commented 1 year ago

Gave feedback on the PR - since it involves a few different test improvements; it would be make it easier to merge if the fixes for different things are split into small standalone PRs.

It'll make it easier for people here evaluate the tests that is just for this issue.

fdemmer commented 1 year ago

I think test improvements are dearly needed in this project; my PR is meant as a general improvement of things not just this issue.

Anyway, for the issue at hand you could pull just this commit: https://github.com/jazzband/django-formtools/pull/230/commits/c6d9eda3c1134c2001cac62472826967858b06c7 ... remove the skip() and implement "dynamic_form_list" in a way, that does not break it.

neilmb commented 1 year ago

I would like to use a condition_dict with callable values that access data from previous steps as described in https://django-formtools.readthedocs.io/en/latest/wizard.html#formtools.wizard.views.WizardView.condition_dict. This doesn't seem to work in the 2.4 release because of infinite recursion errors.

What are my options right now? I've thought of:

Are there other options for being able to use this feature that I'm not thinking of?

spapas commented 1 year ago

@neilmb the other option is to override get_form of your class like this (see this method https://github.com/jazzband/django-formtools/blob/master/formtools/wizard/views.py#L391 ; I have changed it to not use the get_form_list() method):

    def get_form(self, step=None, data=None, files=None):
        if step is None:
            step = self.steps.current
        form_class = self.form_list[step]
        kwargs = self.get_form_kwargs(step)
        kwargs.update({
            'data': data,
            'files': files,
            'prefix': self.get_form_prefix(step, form_class),
            'initial': self.get_form_initial(step),
        })
        if issubclass(form_class, (forms.ModelForm, forms.models.BaseInlineFormSet)):
            kwargs.setdefault('instance', self.get_form_instance(step))
        elif issubclass(form_class, forms.models.BaseModelFormSet):
            kwargs.setdefault('queryset', self.get_form_instance(step))
        return form_class(**kwargs)
JanMalte commented 1 year ago

I'm confused about why people are advocating for this change to be reverted "ASAP". Is there a reason why affected people can't stick with 2.3 until a comprehensive solution for this issue is found?

A reason could be that version 2.4.0 added official support for django v4

Maybe a version 2.3.1 could be released with the old behaviour and official django v4 support? So we can pin to <2.4 until the issue is somehow resolved and upgrade/use django v4.

violuke commented 1 year ago

And in an ideal world, SemVer would be used so people could understand version numbers and use modern package managers like Poetry properly. This being a breaking change could then be safely introduced using version 3.0.0 instead of 2.4.0, if it was wanted (not that I'm sure it should be tbh). Having come from other languages to Python in the last few years, the Python community seems quite out of sync with the rest on the SemVer front and SemVer is really damn handy when managing dependencies.

stuaxo commented 1 year ago

OK, so I had a look back at the original PR and a bit of a think -

There are two use cases, one is an API for callers outside the wizard to get a list of forms, and the case where inside get_form we need a list.

If these have separate names we can avoid the issue.

schinckel commented 1 year ago

The original use case was to dynamically inject forms: specifically given the input from step A, run n instances of step B with different data, as “sub-steps”.

stuaxo commented 1 year ago

In that case is it enough to keep it and people use the solution @spapas proposes if people want to overide get_form ?

sagesharp commented 1 year ago

Hi folks! We are also being bitten by this infinite recursion bug. There was an ask for examples of forms with this issue. I don't have the simplest example (as was the ask), but you can take a look at our usage case in the Outreachy website repository. I've documented steps to reproduce the issue in the Outreachy website issue tracker

Outreachy is an internship program that supports diversity in open source. Outreachy uses django-formtools to have potential interns fill out an application. We receive over 3,000 applications per internship cohort, so we really appreciate this Django package! :heart:

We'll pin our django-formtools version to 2.3 until this issue can be fixed. I'll also take a look at the other solutions posted here. I don't know if I'll have time to make them work before our applications open on January 16, 2023, but I'll try. :sweat_smile:

claudep commented 1 year ago

@schinckel, any opinion about the possible fix in #239 ?

schinckel commented 1 year ago

Yeah, that looks okay to me.

MrkGrgsn commented 1 year ago

I've just submitted a PR that attempts a more complete fix for this problem. Please have a look and give it a test if you've got time.