Open d-monnet opened 9 months ago
PR #25 should solve this. However, I want to keep this PR and make it a branch, just in case it outperforms #25 on larger or smaller models on GPU (to be tested).