UoB-DSMP-2023-24 / dsmp-2024-group-7

dsmp-2024-group-7 created by GitHub Classroom

Task 6: use word embeddings to encode the sequences and use the result as input to train the deep learning model #22

Closed YOHOHO111 closed 3 months ago

YOHOHO111 commented 4 months ago

The results are not ideal. On the mouse combined chain, the accuracy is about 0.60 and the loss is about 1, although val_acc and val_loss are better. The human combined chain is worse: the accuracy is about 0.30 and the loss is about 2, and val_acc and val_loss are even worse.

YOHOHO111 commented 4 months ago

Also, I tried using a sliding window to cut the sequences, but it produced too many fragments. I haven't found a way to deal with them yet, and they cannot be used to train the model directly.
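
For reference, a minimal sketch of the sliding-window cut described here, to show why the number of fragments grows so quickly. The function name `sliding_windows`, the window size, and the example CDR3 string are assumptions made for illustration; they are not taken from the project code.

```python
def sliding_windows(sequence, size, step=1):
    """Return every window of length `size`, moving `step` residues at a time."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # A sequence of length n yields (n - size) // step + 1 windows (0 if n < size).
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, step)]


# Hypothetical CDR3 sequence, used only for illustration.
cdr3 = "CASSLGQAYEQYF"
windows = sliding_windows(cdr3, size=3)
print(len(windows), windows[:3])
```

Each sequence produces a variable-length list of fragments, so the raw output cannot be stacked into a fixed-size training matrix without an extra aggregation step such as counting the fragments.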

YOHOHO111 commented 4 months ago

Tried using the V and J segments and MHC as features, using k-mers to encode the CDR3 sequence, and combining all the features into one matrix. The new matrix gives a better result.
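
A rough sketch of how such a combined matrix could be built, assuming k-mer counts for the CDR3 sequence and one-hot columns for the categorical features. The record layout (`cdr3`, `v`, `j`, `mhc` keys), the helper names, the choice of k=2, and the example values are all hypothetical; the actual project may encode these differently.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_vocabulary(k):
    """All possible k-mers over the amino-acid alphabet, in a fixed order."""
    return ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]


def kmer_counts(sequence, vocab, k):
    """Fixed-length vector: how often each vocabulary k-mer occurs in the sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocab]


def one_hot(value, categories):
    """One column per known category, 1 where the value matches."""
    return [1 if value == c else 0 for c in categories]


def build_matrix(records, k=2):
    """One row per record: k-mer counts followed by one-hot V, J and MHC columns."""
    vocab = kmer_vocabulary(k)
    v_cats = sorted({r["v"] for r in records})
    j_cats = sorted({r["j"] for r in records})
    mhc_cats = sorted({r["mhc"] for r in records})
    return [
        kmer_counts(r["cdr3"], vocab, k)
        + one_hot(r["v"], v_cats)
        + one_hot(r["j"], j_cats)
        + one_hot(r["mhc"], mhc_cats)
        for r in records
    ]


# Hypothetical records, used only for illustration.
records = [
    {"cdr3": "CASSIRSSYEQYF", "v": "TRBV19", "j": "TRBJ2-7", "mhc": "HLA-A*02:01"},
    {"cdr3": "CASSLGQAYEQYF", "v": "TRBV7-9", "j": "TRBJ2-7", "mhc": "HLA-B*07:02"},
]
matrix = build_matrix(records)
print(len(matrix), len(matrix[0]))
```

Because every row has the same width regardless of CDR3 length, the matrix can be passed directly to a dense model, which avoids the variable-length problem of the raw sliding-window output.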

YOHOHO111 commented 4 months ago

Simplified the model structure by replacing the RNN with a DNN.