microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

ValueError: `temperature` (=0.0) has to be a strictly positive float, otherwise your next token scores will be invalid. If you're looking for greedy decoding strategies, set `do_sample=False`. #416

Closed CAICCU closed 2 months ago

CAICCU commented 2 months ago

This is my settings.yaml:

llm:
  api_key: EMPT
  api_base: http://0.0.0.0:9997/v1
  type: openai_chat # or azure_openai_chat
  model: chatglm3-6b # gpt-4-turbo-preview
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 4096
  # request_timeout: 180.0
  # api_base: https://<instance>.openai.azure.com
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

I use the xinference API and run `python -m graphrag.index --root ./ragtest`,
but the create_base_extracted_entities step fails with this error from xinference:
ValueError: `temperature` (=0.0) has to be a strictly positive float, otherwise your next token scores will be invalid. If you're looking for greedy decoding strategies, set `do_sample=False`.
How can I solve this?

AlonsoGuevara commented 2 months ago

Hi @CAICCU If you're running from source, support for the temperature parameter was added in #390 and #373, so a new build should fix this.

zanderjiang commented 2 months ago

glm does not support 0.0 or 1.0 for its temperature and top_p, which are the default values in the config, so you need to change them.
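
Building on the comments above, a possible workaround (assuming you are on a build that includes the temperature support from #390 and #373) is to set explicitly positive values for `temperature` and `top_p` in the `llm` block of settings.yaml. The specific values below are illustrative, not tested defaults:

```yaml
llm:
  api_key: EMPT
  api_base: http://0.0.0.0:9997/v1
  type: openai_chat
  model: chatglm3-6b
  model_supports_json: true
  max_tokens: 4096
  temperature: 0.5 # must be strictly positive, and not 0.0 or 1.0 for glm; 0.5 is an example value
  top_p: 0.9       # likewise avoid 0.0 and 1.0; 0.9 is an example value
```

Any strictly positive temperature other than the values glm rejects should avoid the ValueError; tune both parameters to your model's needs.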