Encountering KeyError and NaN label issues with end-to-end experiments

LechengKong / OneForAll

A foundational graph learning framework that solves cross-domain/cross-task classification problems using one model.

MIT License

161 stars, 22 forks

Encountering KeyError and NaN label issues with end-to-end experiments #1

Closed: FUTUREEEEEE closed this issue 9 months ago

FUTUREEEEEE commented 10 months ago

Hi Lecheng,

Thanks for open-sourcing the code. I ran into two errors when running the end-to-end experiments.

The first is a KeyError: 'prompt_node_edge_feat' on the coralink and coranode datasets.

The second is NaN values in the labels of the chempcba, chemhiv, and chemblpre datasets.

Could you please help me understand what might be causing these issues and how to resolve them?

LechengKong commented 10 months ago

Hi @FUTUREEEEEE ,

Thank you for your interest in our work! Sorry about the naming issues; they were caused by a version mismatch in a pull request and should be fixed in the most recent commit. You can follow the new README to run the code. We are still cleaning up the low-resource version and merging it into the e2e version. It should be online soon.

As for the NaN label issue, NaN labels are expected in the chempcba and chemblpre datasets, because some tasks have no ground truth for some samples. The most recent commit also ignores these unmeasured labels during training, so the code should run correctly. We ignore these labels during inference as well.
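To illustrate what ignoring unmeasured labels means, here is a minimal sketch in plain Python. It is not the repository's actual code: the function name `masked_bce` and the sample values are hypothetical, and a real training loop would apply the same mask to tensors rather than lists. Entries whose label is NaN are skipped, and the loss is averaged over the measured entries only.

```python
import math


def masked_bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over entries whose label is not NaN.

    `probs` are predicted probabilities in [0, 1]; `labels` are 0.0, 1.0,
    or NaN for tasks with no ground truth. (Hypothetical helper, for
    illustration only.)
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if math.isnan(y):
            # Unmeasured label: contributes nothing to the loss.
            continue
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        count += 1
    # If every label is unmeasured, return 0.0 instead of dividing by zero.
    return total / count if count else 0.0


# Hypothetical example: the second task has no ground truth.
probs = [0.9, 0.2, 0.7]
labels = [1.0, float("nan"), 0.0]
loss = masked_bce(probs, labels)
```

Without the mask, a single NaN label would turn the whole loss into NaN, which is why the unmeasured entries have to be dropped before averaging.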

Hope this solves the issues. If they persist, feel free to leave a comment.

Best, @LechengKong

FUTUREEEEEE commented 10 months ago

Hi Lecheng,

I appreciate your swift response and the confirmation that the previously reported bugs will be addressed. However, I encountered another error when training the model.

LechengKong commented 10 months ago

Hi @FUTUREEEEEE ,

It looks like a runtime error. Would you mind sharing the command you used so that we can reproduce it? Thanks.

@LechengKong

FUTUREEEEEE commented 10 months ago

I used this command: python run_cdm.py --override e2e_all_config.yaml

LechengKong commented 10 months ago

It should be fixed by the most recent commit. There is no need to regenerate the datasets.

Thank you again for helping us correct the repo. This project is still under rapid development, and there are mismatches between our local version and the online version. Sorry for the inconvenience.