langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Calling GET /v1/datasets fails with error: missing 1 required positional argument: 'tenant_id' #5629

Closed Schumpeterx closed 2 days ago

Schumpeterx commented 3 days ago

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

Create a new Flask project and copy api/controllers/service_api/dataset/dataset.py and the other related .py files into it. Make sure the Flask project runs.

Then call GET /v1/datasets or POST /v1/datasets using Postman. The request fails with the error: missing 1 required positional argument: 'tenant_id'

I guess the reason is that the route lacks a positional argument that the handler in api/controllers/service_api/dataset/dataset.py requires:

class DatasetListApi(DatasetApiResource):
    """Resource for datasets."""

    def get(self, tenant_id):  # requires a positional argument

while the route is

api.add_resource(DatasetListApi, '/datasets')  # does not define a positional argument
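A handler can require a positional argument that the route never defines if something else supplies it, for example a decorator applied by the base class. The sketch below is a plain-Python illustration of that pattern, not Dify's actual code. It assumes that a base class such as DatasetApiResource injects tenant_id through a method decorator. The names inject_tenant_id, BaseResource, TenantResource and BrokenDatasetList, and the placeholder tenant value, are all hypothetical.

```python
from functools import wraps


def inject_tenant_id(view):
    """Hypothetical decorator: pass a tenant ID to the view as its first positional argument."""
    @wraps(view)
    def decorated(*args, **kwargs):
        tenant_id = "tenant-123"  # placeholder; a real app would resolve this from the API token
        return view(tenant_id, *args, **kwargs)
    return decorated


class BaseResource:
    """Stand-in for a resource class that wraps handlers with decorators at dispatch time."""
    method_decorators = []

    def dispatch(self, method_name, **url_kwargs):
        handler = getattr(self, method_name)  # bound method, so `self` is already supplied
        for decorator in self.method_decorators:
            handler = decorator(handler)
        return handler(**url_kwargs)  # only URL variables are passed by the router


class TenantResource(BaseResource):
    # Assumption: the base class is what provides tenant_id, not the route.
    method_decorators = [inject_tenant_id]


class DatasetList(TenantResource):
    def get(self, tenant_id):
        return {"tenant_id": tenant_id}


class BrokenDatasetList(BaseResource):
    # Same handler, but the injecting decorator is missing (e.g. lost when copying files).
    def get(self, tenant_id):
        return {"tenant_id": tenant_id}


ok = DatasetList().dispatch("get")

try:
    BrokenDatasetList().dispatch("get")
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```

If this assumption holds, the route '/datasets' does not need a tenant_id placeholder. The error would instead suggest that the injecting decorator, or the base class that applies it, was changed or dropped when the files were copied into the new project.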

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 3 days ago

@Schumpeterx Hello! I'm here to help you with any bugs, questions, or contributions while we wait for a human maintainer. Let's tackle this together!

To resolve the "missing 1 required positional argument: 'tenant_id'" error when calling the GET /v1/datasets or POST /v1/datasets endpoints in your self-hosted Flask project, you need to ensure that the tenant_id parameter is correctly handled within your application.

Based on the provided code, it appears that the tenant_id is being managed through the current_user.current_tenant_id attribute rather than being passed as a URL parameter. Therefore, you do not need to modify the route definitions to include tenant_id in the URL pattern.

Here is how you can ensure that the tenant_id is correctly utilized:

  1. Ensure current_user is properly set up: Make sure that the current_user object is correctly initialized and contains the current_tenant_id attribute.

  2. Check the DatasetListApi class methods: The get and post methods in the DatasetListApi class already use current_user.current_tenant_id to fetch datasets and create new datasets. Ensure that this attribute is correctly populated.

Here is a snippet from the provided code that shows how tenant_id is used:

class DatasetListApi(Resource):

    @setup_required
    @login_required
    @account_initialization_required
    def get(self):
        page = request.args.get('page', default=1, type=int)
        limit = request.args.get('limit', default=20, type=int)
        ids = request.args.getlist('ids')
        provider = request.args.get('provider', default="vendor")
        search = request.args.get('keyword', default=None, type=str)
        tag_ids = request.args.getlist('tag_ids')

        if ids:
            datasets, total = DatasetService.get_datasets_by_ids(ids, current_user.current_tenant_id)
        else:
            datasets, total = DatasetService.get_datasets(page, limit, provider,
                                                          current_user.current_tenant_id, current_user, search, tag_ids)

        # check embedding setting
        provider_manager = ProviderManager()
        configurations = provider_manager.get_configurations(
            tenant_id=current_user.current_tenant_id
        )

        embedding_models = configurations.get_models(
            model_type=ModelType.TEXT_EMBEDDING,
            only_active=True
        )

        model_names = []
        for embedding_model in embedding_models:
            model_names.append(f"{embedding_model.model}:{embedding_model.provider.provider}")

        data = marshal(datasets, dataset_detail_fields)
        for item in data:
            if item['indexing_technique'] == 'high_quality':
                item_model = f"{item['embedding_model']}:{item['embedding_model_provider']}"
                if item_model in model_names:
                    item['embedding_available'] = True
                else:
                    item['embedding_available'] = False
            else:
                item['embedding_available'] = True

        response = {
            'data': data,
            'has_more': len(datasets) == limit,
            'limit': limit,
            'total': total,
            'page': page
        }
        return response, 200

    @setup_required
    @login_required
    @account_initialization_required
    def post(self):
        parser = reqparse.RequestParser()
        parser.add_argument('name', nullable=False, required=True,
                            help='type is required. Name must be between 1 to 40 characters.',
                            type=_validate_name)
        parser.add_argument('indexing_technique', type=str, location='json',
                            choices=Dataset.INDEXING_TECHNIQUE_LIST,
                            nullable=True,
                            help='Invalid indexing technique.')
        args = parser.parse_args()

        # The role of the current user in the ta table must be admin, owner, or editor
        if not current_user.is_editor:
            raise Forbidden()

        try:
            dataset = DatasetService.create_empty_dataset(
                tenant_id=current_user.current_tenant_id,
                name=args['name'],
                indexing_technique=args['indexing_technique'],
                account=current_user
            )
        except services.errors.dataset.DatasetNameDuplicateError:
            raise DatasetNameDuplicateError()

        return marshal(dataset, dataset_detail_fields), 201

In summary, ensure that the current_user object is correctly initialized and contains the current_tenant_id attribute. The existing code already handles the tenant_id through current_user.current_tenant_id, so no changes to the route definitions are necessary [1][2].


laipz8200 commented 2 days ago

Hi! We couldn't reproduce this issue. Please report only problems related to the original codebase rather than copying the code elsewhere.

If you need further help, please feel free to reopen this issue.