[Marked for deprecation. Please visit https://github.com/brain-score/language for the migrated project.] Benchmarking of language models using human neural and behavioral experiment data
TODO: add functionality to drop (or permute) single X dimensions to evaluate their relative contribution to downstream performance. Open question: measure the contribution on R^2 or on the final metric? (Note: Shapley values may be worth looking into.)
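One possible shape for the permute-one-dimension idea is sketched below. It is not this project's implementation: the function names (`fit_linear`, `predict_linear`, `permutation_importance`) are hypothetical, ordinary least squares stands in for whatever mapping model is actually used, and held-out R^2 is assumed as the score. Each X dimension is shuffled in turn on the held-out split, and the mean drop in R^2 is reported as that dimension's contribution.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_linear(X, y):
    """Ordinary least squares with an intercept (stand-in mapping model)."""
    X1 = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return w


def predict_linear(w, X):
    """Apply the fitted weights (last entry is the intercept)."""
    return np.column_stack([X, np.ones(len(X))]) @ w


def permutation_importance(X_train, y_train, X_test, y_test, n_repeats=10, seed=0):
    """Return baseline held-out R^2 and the mean R^2 drop per shuffled X dimension."""
    rng = np.random.default_rng(seed)
    w = fit_linear(X_train, y_train)
    baseline = r2_score(y_test, predict_linear(w, X_test))
    drops = np.zeros(X_test.shape[1])
    for j in range(X_test.shape[1]):
        for _ in range(n_repeats):
            X_perm = X_test.copy()
            # Shuffle only column j, breaking its link to y.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += baseline - r2_score(y_test, predict_linear(w, X_perm))
    return baseline, drops / n_repeats


# Synthetic data (assumption, for illustration only): dimension 0 matters most,
# dimension 1 a little, dimension 2 not at all.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

baseline, importance = permutation_importance(X[:300], y[:300], X[300:], y[300:])
print("baseline R^2:", baseline)
print("mean R^2 drop per dimension:", importance)
```

Permuting on held-out data avoids refitting the model once per dimension; dropping a dimension and refitting would answer a slightly different question (how well the remaining dimensions can compensate). Swapping `r2_score` for another scoring function would give the same analysis on a different final metric.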