ridiculouz / LLMaAA

The official repository for paper "LLMaAA: Making Large Language Models as Active Annotators"

Command line examples #4

Open pvcastro opened 2 months ago

pvcastro commented 2 months ago

Hi there! Congratulations on the work! Would you mind sharing some sample command lines for running each step of the pipeline mentioned in the README? I feel like I'm having to make too many changes to the code, and maybe that's because I'm not passing the right parameters. For example, the active_annotate script is giving me a hard time on the relation extraction task with retacred: the features loaded from the Processor aren't working, and it keeps raising a different exception at some point. Thanks!

ridiculouz commented 1 month ago

Hi Pedro, if you are hitting an ImportError caused by relative imports, first try running the scripts with the -m option, e.g. python -m src.demo_retrieval. See #3 for the previous discussion. Feel free to reach out if you have any further problems!

pvcastro commented 1 month ago

Hi @ridiculouz, no, I got around those just fine. I'm running into runtime exceptions caused by labels being None, things like that. For instance, in src/data/re_reader.py, all samples end up with None labels because they are not in the cache, and the labels aren't set anywhere else.

ridiculouz commented 1 month ago

Ah, I see. The cache file stores all samples labeled by GPT; if labeling a sample fails because of a connection error or something similar, that sample ends up with a None label.
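
To see how many samples are affected before training, one option is to separate samples that have a usable cached label from those that do not. The following is only a rough sketch, not the repository's code: the function name `split_by_label`, the `id` field, and the assumption that the cache is a plain dict mapping a sample id to its label (or None) are all hypothetical, since the actual cache layout is not described in this thread.

```python
from typing import Any, Dict, List, Optional, Tuple

Sample = Dict[str, Any]


def split_by_label(
    samples: List[Sample],
    cache: Dict[str, Optional[str]],
) -> Tuple[List[Sample], List[Sample]]:
    """Partition samples by whether the cache holds a non-None label.

    Hypothetical helper: assumes each sample has an "id" key and that
    `cache` maps that id to a label string, or to None when labeling failed.
    """
    labeled: List[Sample] = []
    missing: List[Sample] = []
    for sample in samples:
        # A sample absent from the cache is treated like a failed annotation.
        label = cache.get(sample["id"])
        if label is None:
            missing.append(sample)
        else:
            # Copy the sample so the input list is left unmodified.
            labeled.append({**sample, "label": label})
    return labeled, missing


if __name__ == "__main__":
    demo_samples = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
    demo_cache = {"a": "per:title", "b": None}  # "c" was never annotated
    ok, todo = split_by_label(demo_samples, demo_cache)
    print(f"{len(ok)} labeled, {len(todo)} need (re-)annotation")
```

Samples in the second list could then be skipped or sent through annotation again, rather than reaching the reader with a None label.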