expectedparrot / edsl

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
https://docs.expectedparrot.com
MIT License

Conjure: If a field has already been renamed, just show a message, do not throw an exception #681

Open rbyh opened 1 week ago

rbyh commented 1 week ago

You may want to keep a line in your notebook that renames fields (e.g., question names) and wind up rerunning it as you work. Instead of throwing an exception because a field has already been renamed, it would be fine to just show a message that the rename is already done and no change is needed.

rbyh commented 1 week ago

i.e., this message may just be disruptive, not helpful:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 c.rename("my_old_name", "my_new_name")

File ~/edsl/edsl/conjure/InputData.py:147, in InputDataABC.rename(self, old_name, new_name)
    139 def rename(self, old_name, new_name) -> 'InputData':
    140     """Rename a question.
    141     
    142     >>> id = InputDataABC.example()
   (...)
    145     
    146     """
--> 147     idx = self.question_names.index(old_name)
    148     self.question_names[idx] = new_name
    149     self.answer_codebook[new_name] = self.answer_codebook.pop(old_name, {})

ValueError: 'my_new_name' is not in list

johnjosephhorton commented 1 week ago

Can you say a bit more about this one, as I don't understand it. If you're re-loading from the 'raw' input file, you'll need to go through this again.

rbyh commented 1 week ago

Right, I mean when you are not reimporting the raw data but are otherwise rerunning all your code while you work on it. I just think it would be a better experience if, when you try to do something that is already done, you get a message instead of an exception that stops execution.

johnjosephhorton commented 1 week ago

I still don't get why you'd do this though - can you give me an example? I would expect the code that is re-run to start from the raw data and go all the way to execution.

One thought - a Conjure object should perhaps store "operations" that have been done to it, like re-namings or option re-orderings, question drops etc. Then, when re-instantiated, it re-runs those operations. If they are done again, it's basically idempotent i.e., it doesn't do them again.
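A rough sketch of the "stored operations" idea, to make it concrete. `LoggedData`, `operations`, and `replay` are hypothetical names for illustration only and are not part of edsl; the class is a stand-in, not the real Conjure object.

```python
class LoggedData:
    """Hypothetical stand-in that remembers the operations applied to it."""

    def __init__(self, question_names):
        self.question_names = list(question_names)
        self.operations = []  # e.g. ("rename", "bad", "good")

    def rename(self, old_name, new_name):
        op = ("rename", old_name, new_name)
        if op in self.operations:
            # This exact rename was recorded earlier, so treat it as a no-op.
            return self
        idx = self.question_names.index(old_name)  # ValueError if old_name is unknown
        self.question_names[idx] = new_name
        self.operations.append(op)
        return self

    def replay(self, raw_question_names):
        """Re-apply the recorded operations to a fresh copy of the raw names."""
        fresh = LoggedData(raw_question_names)
        for kind, *args in self.operations:
            getattr(fresh, kind)(*args)
        return fresh


d = LoggedData(["bad", "q2"])
d.rename("bad", "good")
d.rename("bad", "good")  # recorded already, so nothing happens
fresh = d.replay(["bad", "q2"])
```

Matching on the exact recorded operation is a simplification: a sequence such as "bad" -> "good", "good" -> "bad", "bad" -> "good" would need something smarter than a membership test.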

rbyh commented 4 days ago

If you are rerunning part of your code and inadvertently rerun a rename command, it should simply check and not do anything, not throw an error which interrupts everything.
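Something like the following is what I have in mind, as a minimal sketch. `RenameDemo` is a hypothetical stand-in, not the actual `InputDataABC`; only the last three lines of `rename` mirror the code shown in the traceback, and the message wording is an assumption.

```python
class RenameDemo:
    """Hypothetical stand-in for a Conjure input-data object (not edsl code)."""

    def __init__(self, question_names):
        self.question_names = list(question_names)
        self.answer_codebook = {}

    def rename(self, old_name, new_name):
        if old_name not in self.question_names:
            if new_name in self.question_names:
                # Looks already done: report it and leave the data untouched.
                print(f"'{old_name}' already renamed to '{new_name}'; no change needed.")
                return self
            # Neither name exists, so this is a real error.
            raise ValueError(f"'{old_name}' is not in list")
        idx = self.question_names.index(old_name)
        self.question_names[idx] = new_name
        self.answer_codebook[new_name] = self.answer_codebook.pop(old_name, {})
        return self


c = RenameDemo(["my_old_name", "q2"])
c.rename("my_old_name", "my_new_name")
c.rename("my_old_name", "my_new_name")  # rerun: prints a message instead of raising
```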

johnjosephhorton commented 4 days ago

I'm going to disagree with this one. Suppose you start with "bad" and rename it to "good." If you later run a rename command proposing "bad" -> "good" again, the code is going to first look for the "bad" column. When it doesn't find it, it would then have to say "oh, wait - it's renaming to 'good' and I already have a column named 'good' - presumably this was set earlier." That would require the object to keep a memory of renamings, which might be worthwhile, but I'm not sure.

It also has to make an inference. Suppose the user has a "bad2" column for real. Maybe the user mistypes "bad2 -> good2" as "bad -> good2". The command will do nothing, which would be bad because the typo goes unnoticed.
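To illustrate the concern, here is a small sketch contrasting two policies. Both functions and the `history` set are hypothetical helpers written for this comment, not edsl code. The lenient version skips whenever the old name is missing; the second skips only when that exact rename was recorded earlier.

```python
def lenient_rename(names, old_name, new_name):
    """Skip whenever old_name is missing (pure inference, no memory)."""
    if old_name not in names:
        return list(names)  # silently assumes the rename was already done
    idx = names.index(old_name)
    return names[:idx] + [new_name] + names[idx + 1:]


def remembered_rename(names, history, old_name, new_name):
    """Skip only when this exact rename was recorded in history."""
    if (old_name, new_name) in history:
        return list(names)
    idx = names.index(old_name)  # raises ValueError for an unknown old_name
    history.add((old_name, new_name))
    return names[:idx] + [new_name] + names[idx + 1:]


names = ["good", "bad2"]      # "bad" was renamed to "good" earlier
history = {("bad", "good")}   # memory of that earlier rename

# Typo: the user meant "bad2" -> "good2" but typed "bad" -> "good2".
after_typo = lenient_rename(names, "bad", "good2")  # nothing changes, no warning

try:
    remembered_rename(names, history, "bad", "good2")
    typo_caught = False
except ValueError:
    typo_caught = True  # the typo surfaces as an error
```

With the memory, a genuine rerun of "bad" -> "good" is still a no-op, while the mistyped command fails loudly instead of leaving "bad2" in place.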