logancyang / obsidian-copilot

THE Copilot in Obsidian
GNU Affero General Public License v3.0
2.55k stars 177 forks source link

[FR] List reference documents #336

Closed wwjCMP closed 6 months ago

wwjCMP commented 6 months ago

Is it possible to list the contents of which files were referenced in the answer in Valut QA mode? Just like Microsoft's copilot.

logancyang commented 6 months ago

Do you have a specific screenshot of what you would like to see?

wwjCMP commented 6 months ago

image

logancyang commented 6 months ago

Doesn't it already do that? Or do you mean the markers among the sentences specifically

image
wwjCMP commented 6 months ago

Very strange, I haven't achieved this effect here, I'm using GEMINI PRO

wwjCMP commented 6 months ago

I found that I did not correctly index my valut

wwjCMP commented 6 months ago

It is necessary to add a feature to skip index timeout files, as there will always be some files blocking the progress during the index process

wwjCMP commented 6 months ago

Snipaste_2024-03-06_20-29-12 Always stuck in this number

wwjCMP commented 6 months ago

Snipaste_2024-03-07_04-54-39 What's the problem with this

logancyang commented 6 months ago

Snipaste_2024-03-07_04-54-39 What's the problem with this

I haven't seen Cohere CORS issue before, but there's 1 line showing a local call with CORS issue. Was your embedding model set as Cohere or ollama? If ollama, you must start the server with the right env variable as mentioned in the local copilot setup guide.

As for the indexing exclusion, it is coming in the next release https://github.com/logancyang/obsidian-copilot/issues/334

wwjCMP commented 6 months ago

Snipaste_2024-03-07_04-54-39这有什么问题

我以前没有见过 Cohere CORS 问题,但有 1 行显示存在 CORS 问题的本地调用。您的嵌入模型设置为 Cohere 还是 ollama?如果是 ollama,您必须使用本地 copilot 设置指南中提到的正确环境变量启动服务器。

至于索引排除,它将在下一个版本中出现#334

I can use Ollma by starting it correctly, but its performance is really poor

wwjCMP commented 6 months ago

(obsidian://open?vault=scientific research&file=2022-%E4%BA%8C%E7%A1%AB%E5%8C%96%E9%92%BC%E7%9A%84%E7%94%B5%E5%AD%90%E8%83%BD%E5%B8%A6%E7%BB%93%E6%9E%84%E5%92%8C%E4%BD%8E%E6%B8%A9%E8%BE%93%E8%BF%90%E5%AE%9E%E9%AA%8C%E8%BF%9B%E5%B1%95) This part of the answer is not necessary, sometimes it can go wrong. We just need to know the file information

wwjCMP commented 6 months ago

Snipaste_2024-03-07_04-54-39这有什么问题

我以前没有见过 Cohere CORS 问题,但有 1 行显示存在 CORS 问题的本地调用。您的嵌入模型设置为 Cohere 还是 ollama?如果是 ollama,您必须使用本地 copilot 设置指南中提到的正确环境变量启动服务器。 至于索引排除,它将在下一个版本中出现#334

I can use Ollma by starting it correctly, but its performance is really poor

The poor performance is unbelievable, and I doubt whether the entire workflow is executed correctly. My valut is in Chinese

logancyang commented 6 months ago

@wwjCMP the nomic embedding model is English only, you need OpenAI embedding models for Chinese

wwjCMP commented 6 months ago

@wwjCMP the nomic embedding model is English only, you need OpenAI embedding models for Chinese

i got it

wwjCMP commented 6 months ago

@wwjCMP the nomic embedding model is English only, you need OpenAI embedding models for Chinese

I found it wasn't just the voice that was the problem. Even if I use the Chinese local embedding model and the referenced file also contains relevant content, the large model still cannot output valid content. It's either that I can't answer or just a few simple words.

wwjCMP commented 6 months ago

LLM:GEMINI PRO embedding:Dmeta-embedding-zh , through ollama

Relevant literature can be retrieved, but valid answers cannot be generated

wwjCMP commented 6 months ago

(obsidian://open?vault=scientific research&file=2022-%) Cannot be displayed as a correct link, only displayed as name plus link

logancyang commented 6 months ago

@wwjCMP how did you use embedding:Dmeta-embedding-zh , through ollama? Through proxy? I thought it's broken now https://github.com/logancyang/obsidian-copilot/issues/332

Or do you mean you created an ollama custom model to be served from the ollama local server? I haven't tried that yet.

By Relevant literature can be retrieved, but valid answers cannot be generated, do you mean you can see things below "sources"? Then the vector search should be fine. Gemini pro should work with Chinese. Can you attach screenshots with you chat along with the console

logancyang commented 6 months ago

@wwjCMP sorry I think you mentioned before but I forgot, what are some of the best LLM proxy API providers now? Best if they are free, I need to test the override feature with them.

wwjCMP commented 6 months ago

@wwjCMP how did you use embedding:Dmeta-embedding-zh , through ollama? Through proxy? I thought it's broken now #332

Or do you mean you created an ollama custom model to be served from the ollama local server? I haven't tried that yet.

By Relevant literature can be retrieved, but valid answers cannot be generated, do you mean you can see things below "sources"? Then the vector search should be fine. Gemini pro should work with Chinese. Can you attach screenshots with you chat along with the console

I simply named it nomic-embed-text when using the self-built model. It seems that this is just what is needed for copilot to be able to call it.

wwjCMP commented 6 months ago

For ollama, it seems that various customized models can be deployed, and they only need to be converted into gguf format, but the performance has not been tested.

wwjCMP commented 6 months ago

https://github.com/ollama/ollama/issues/2965

wwjCMP commented 6 months ago

{This article does not mention vacancy regulation. Sources: [Research-2024-Fund-Transition Metal Disulfides-Vacancy Regulation](obsidian://open?vault=scientific research&file=%E8%B0%83%E7%A0%94-2024-%E5%9F% BA%E9%87%91-%E8%BF%87%E6%B8%A1%E9%87%91%E5%B1%9E%E4%BA%8C%E7%A1%AB%E5%8C%96 %E7%89%A9-%E7%A9%BA%E4%BD%8D%E8%B0%83%E6%8E%A7) [2020-Introduction of coupling in Ni-doped CoSe by interlayer deintercalation method Vacancy pairs to improve electrocatalytic hydrogen production performance](obsidian://open?vault=scientific research&file=2020-%E9%80%9A%E8%BF%87%E5%B1%82%E9%97%B4%E8 %84%B1%E6%8F%92%E5%85%A5%E6%B3%95%E5%9C%A8Ni%E6%8E%BA%E6%9D%82CoSe%E4%B8%AD%E5%BC %95%E5%85%A5%E8%80%A6%E5%90%88%E7%9A%84%E7%A9%BA%E4%BD%8D%E5%AF%B9%2C%E6%94 %B9%E5%96%84%E7%94%B5%E5%82%AC%E5%8C%96%E5%88%B6%E6%B0%A2%E6%80%A7%E8%83%BD ) [2022-Improving the intrinsic activity and selectivity of single-atom Rh in syngas-c2 oxygenates by modifying local coordination atoms](obsidian://open?vault=scientific research&file=2022-%E9%80% 9A%E8%BF%87%E4%BF%AE%E9%A5%B0%E5%B1%80%E9%83%A8%E9%85%8D%E4%BD%8D%E5%8E%9F% E5%AD%90%EF%BC%8C%E6%8F%90%E9%AB%98%E5%90%88%E6%88%90%E6%B0%94-%20c2%E6%B0%A7 %E5%90%88%E7%89%A9%E4%B8%AD%E5%8D%95%E5%8E%9F%E5%AD%90Rh%E7%9A%84%E6%9C%AC%E5 %BE%81%E6%B4%BB%E6%80%A7%E5%92%8C%E9%80%89%E6%8B%A9%E6%80%A7) [Literature Knowledge Notes: Heterojunction ( Spin+Valley) valve](obsidian://open?vault=scientific research&file=%E6%96%87%E7%8C%AE%E7%9F%A5%E8%AF%86%E7%AC%94% E8%AE%B0%EF%BC%9A%E5%BC%82%E8%B4%A8%E7%BB%93(%E8%87%AA%E6%97%8B%2B%E8%B0%B7 )%E9%98%80) [2021-Steric hindrance and functional promotion of high performance in electrochemical CO methanation on MoS2 and WS2 anti-site defects](obsidian://open?vault=scientific research&file=2021-%E7 %A9%BA%E9%97%B4%E4%BD%8D%E9%98%BB%E5%92%8C%E5%8A%9F%E8%83%BD%E4%BF%83%E8%BF %9B%E7%94%B5%E5%8C%96%E5%AD%A6%20CO%20%E7%94%B2%E7%83%B7%E5%8C%96%E5%9C%A8%20MoS2 %20%E5%92%8C%20WS2%20%E5%8F%8D%E4%BD%8D%E7%BC%BA%E9%99%B7%E4%B8%8A%E7%9A%84%E9 %AB%98%E6%80%A7%E8%83%BD) [2019-Coordinated project to reduce carbon dioxide with cobalt nitrogen functionalized materials](obsidian://open?vault=scientific research&file=2019-%E9%92% B4%E6%B0%AE%E5%8A%9F%E8%83%BD%E5%8C%96%E6%9D%90%E6%96%99%E5%87%8F%E5%B0%91% E4%BA%8C%E6%B0%A7%E5%8C%96%E7%A2%B3%E7%9A%84%E5%8D%8F%E8%B0%83%E5%B7%A5%E7% A8%8B) [Coordination number](obsidian://open?vault=scientific research&file=%E9%85%8D%E4%BD%8D%E6%95%B0) [2021-Metal nitrogen doping for CO2 electroreduction Heterocarbon single-atom electrocatalyst](obsidian://open?vault=scientific research&file=2021-CO2%E7%94%B5%E8%BF%98%E5%8E%9F%E7%94%A8%E9%87 %91%E5%B1%9E%E6%B0%AE%E6%8E%BA%E6%9D%82%E7%A2%B3%E5%8D%95%E5%8E%9F%E5%AD%90 %E7%94%B5%E5%82%AC%E5%8C%96%E5%89%82)} i translate it in English

wwjCMP commented 6 months ago

Great! Let's dive into the properties of MoS2.

  1. Electronic Properties:

    • Bandgap: MoS2 has an indirect bandgap, which makes it suitable for optoelectronic applications like solar cells and LEDs.
    • Surface States: The presence of S 2p orbitals results in a type-II bandgap, allowing for efficient photovoltage generation.
    • Valence Band Edge (VBE): The VBE is located at the K point in the Brillouin zone, making it a good candidate for photoanodes in solar cells.
    • Electron-Phonon Coupling: This material exhibits strong electron-phonon coupling, which can affect its stability and device performance.
  2. Catalytic Activity:

    • Oxidation/Reduction Reactions: MoS2 has been shown to catalyze oxygen reduction reactions (ORR) in fuel cells due to its ability to adsorb and activate O2 molecules.
    • CO2RR: As you mentioned, it's a promising material for CO2 reduction to methanol or other value-added chemicals. The single atom catalysts on MoS2 can significantly enhance the reaction rate and selectivity.
  3. Applications:

    • Electronics: Its semiconducting properties make MoS2 suitable for thin film transistors, field-effect transistors, and photodetectors.
    • Energy Storage: It's used in batteries due to its high charge carrier mobility and excellent rate capability.
    • Sensing: The surface chemistry of MoS2 can be tailored for specific gas sensing applications.
  4. Structural Variations:

    • Single Atom Catalysis: As you've seen, single Co atoms on MoS2 can enhance the CO2 reduction process, demonstrating its potential as a catalyst.
    • MXene Support: In your recent study, Mo2CS2-MXene composites showed enhanced catalytic activity for CO2 electroreduction, highlighting the synergistic effect of MXene and MoS2.
  5. Synthesis and Characterization:

    • Synthesis Methods: Various techniques like chemical vapor deposition (CVD), exfoliation, or mechanical methods can be employed to prepare MoS2.
    • Characterization Techniques: Techniques like XRD, Raman spectroscopy, and scanning electron microscopy are used to confirm the quality and structure of the material.

These are some key aspects. Let me know if you need more detailed information on any specific topic!

Sources:

I switched my LLM to qwen1.5 LM studio and it can output correctly, but the speed is much slower than usual. It seems that the problem occurs with GEMINI PRO. There is also the performance degradation caused by mixing Chinese and English. I hope that langchain will provide Chinese switching options in the future. Like here, I asked a question in Chinese, but the answer was in English. This creates a bad experience.

wwjCMP commented 6 months ago

Another problem is that the content retrieved through the embedding model contains a large number of link content, such as web links, external jump links, and internal embedded links (pictures, di-directional links, etc.). These contents are worthless for LLM to product answers . I wonder if they can be excluded from the langchain in exchange for more valid tokens.

wwjCMP commented 6 months ago

When there are continuous conversations in QA vault mode, do not use the source content at the end of the above conversation as memory input. This part seems to be worthless.

wwjCMP commented 6 months ago

I used Google Translate for my answer, there may be many errors, please forgive me.

logancyang commented 6 months ago

Just write Chinese,谷歌翻译太奇怪完全看不懂😅

logancyang commented 6 months ago

可以加入我们的discord,在contribute那里最下面

wwjCMP commented 6 months ago

好的,上面说的几点是

  1. 换成本地的qwen1.5 模型之后可以正常回答,所以中文嵌入模型应该是正常工作。GEMINI PRO不知道为什么不能正常工作。
  2. 虽然提问和检索到的语料都是中文,但回答是英文,应该是预设的提示词是英文。
  3. 由于obsidian笔记的书写的原因,检索到的语料块中可能包含各种链接信息,比如双链,比如图片链接,比如web链接,外部跳转链接,但这些信息都是无用的,希望可以在发送给LLM之前用正则去除掉。
  4. 最后的参考链接不能正确显示,有时候链接也不能正确打开。
  5. 全局问答模式下,不要把当前回答的参考链接部分内容作为记忆发送给LLM
wwjCMP commented 6 months ago

可以加入我们的discord,在contribute那里最下面

找不到discord链接在哪里

logancyang commented 6 months ago

https://discord.gg/CYDvNtGHkQ

wwjCMP commented 6 months ago

更新2.5.1之后,嵌入模型不起作用了

logancyang commented 6 months ago

@wwjCMP 有试clear vector store 然后重新index吗

wwjCMP commented 6 months ago

@wwjCMP 有试clear vector store 然后重新index吗

重新index之后可以了

wwjCMP commented 5 months ago

Snipaste_2024-03-15_15-02-55 gemini 这个最后的链接显示不正确有办法处理吗