karthink / gptel

A simple LLM client for Emacs
GNU General Public License v3.0

Is there a limitation on context/prompt length? #74

Closed JonatanSahar closed 6 months ago

JonatanSahar commented 1 year ago

I got the following error when sending the discussion section of a paper through gptel, but had no problem running it through the ChatGPT web interface:

{"model":"gpt-3.5-turbo","messages":[{"role":"system","content":"You are a large language model living in Emacs and a helpful assistant. Respond concisely."},{"role":"user","content":"Discussion\nWe tested two opposing views on the mechanism of sensory processing in the auditory midbrain (IC) and auditory thalamus (MGB). In one view, sensory processing can be explained by habituation to local stimulus statistics (Figure 1C, h1), in the other by predictive coding (Figure 1C, h2). The study included a novel paradigm that orthogonalised local stimulus statistics and subjects expectations. We used ultra-high-resolution 7-Tesla fMRI optimised for imaging the IC and MGB. There were three key findings: First, mean BOLD responses in IC and MGB correlated with the subjects expectations of the probability of the stimulus occurrence but not with the local stimulus statistics. Second, events deviating from local stimulus statistics did not lead to increased responses in IC and MGB if subjects expected these events. Third, Bayesian model comparison showed that the responses of the majority of voxels in IC and MGB are best explained by a predictive coding model. Together, the findings indicate that sensory processing in auditory midbrain and thalamus are mostly driven by expectations of the subject and not by regularities in the local stimulus statistics.\n\nSeveral previous studies have interpreted response properties of subcortical sensory nuclei within a predictive coding framework (Font-Alaminos et al., 2020; Carbajal and Malmierca, 2018; Parras et al., 2017; Malmierca et al., 2015; Cacciaglia et al., 2015; Ulanovsky et al., 2003). These studies have, however, used designs where predictions were generated based on the regularities of the local stimulus statistics. 
Although mesoscopic responses to violation of abstract rules have been reported in the sensory cortex (e.g., N t nen et al., 1978; Paavilainen, 2013; Kok and de Lange, 2015; de Lange et al., 2018), they have not been reported in subcortical nuclei to-date. Our study breaks with a long tradition on research on subcortical SSA (Font-Alaminos et al., 2020; Parras et al., 2017; Robinson et al., 2016; Cacciaglia et al., 2015; Duque and Malmierca, 2015; Ayala et al., 2015; Cornella et al., 2015; Gao et al., 2014; Anderson and Malmierca, 2013; Ayala et al., 2012; P rez-Gonz lez et al., 2012; Zhao et al., 2011; B uerle et al., 2011; Antunes and Malmierca, 2011; Antunes et al., 2010; Anderson et al., 2009; Malmierca et al., 2009; Yu et al., 2009) by defining the predictions based on abstract rules that were orthogonal to the regularity of the stimulus local statistics. Only one study attempted to investigate the impact of abstract rules on SSA using alternating tone sequences in anaesthetised rats (Malmierca et al., 2019). They found that only around 5% of the measured units (comparable to the false discovery rate a=0.05\n of the study) showed deviant responses to violations of the abstract rules.\n\nA study on SSA in the rodent auditory system (Parras et al., 2017) where predictability was controlled using local stimulus statistics reported that structures at increasingly higher stages of the auditory pathway show increasing amounts of prediction error. The authors defined prediction error as the responses to sounds that deviate from the predictions in comparison to the responses to those same sounds when there were no available predictions. The authors concluded that the IC, MGB, and AC form a hierarchical network of prediction error. 
Although the studies use different paradigms in different species, a similar analysis can be done in our data by comparing the responses to the most unexpected deviant (dev4\n) with those for which no prediction is available; that is, the first standard in the sequences std0\n. Responses to dev4\n are higher than responses to std0\n in both, IC and MGB (Table 2 and Figure 3). This contrast with Parras results, where the IC showed little or no difference between the responses elicited by deviant and control sounds.\n\nNuclei in the auditory pathway are organised in primary (or lemniscal) and secondary (or non-lemniscal) subdivisions. The lemniscal division of the auditory pathway has narrowly tuned frequency responses and is considered as responsible for the transmission of bottom-up information; the non-lemniscal division presents wider tuned frequency responses and is also involved in multisensory integration (Hu, 2003). In the animal neurophysiology literature the strongest SSA is typically reported in non-lemniscal areas; that is, in dorsal and medial sections of the MGB (Antunes et al., 2010; Antunes and Malmierca, 2011; Duque et al., 2014) and the cortices of the IC (P rez-Gonz lez et al., 2012; Gao et al., 2014; Duque et al., 2014; Ayala and Malmierca, 2015; Ayala and Malmierca, 2018). Subdivisions of IC and MGB are notoriously difficult to assess in humans in vivo because of their small size and deep location within the brain (Moerel et al., 2015; Mihai et al., 2019). Nevertheless, our results showed that the SSA index had comparable distributions in the ventral and dorsal subdivisions of the MGB (Figure 5A). Moreover, our results showed that MGB regions driven by the predictive coding model were predominant in the ventral (lemniscal) tonotopic gradient of the MGB (Mihai et al., 2019) as well as in the rest of the MGB. 
Regarding the IC, there is to-date no available anatomical or functional atlas delimiting its central section (lemniscal) from its cortex (non-lemniscal). Nevertheless, our results show that the predictive coding model is the most likely generator of the data across the entire nuclei. We therefore assume that predictive coding underlies encoding of both, lemniscal and non-lemniscal subdivisions of the IC and MGB.\n\nThis fundamental difference with the animal literature might stem from a number of reasons. First, our design involved an active task: lemniscal pathways might only be strongly modulated by predictions when they carry behaviourally relevant sensory information. Second, the modulation of the subcortical pathways might be fundamentally different in humans compared to other mammals. Last, given the strength of the SSA effects reported in this study, it is possible that regions with weak SSA might have been contaminated with signal stemming from areas with strong SSA due to smoothing and interpolation necessary for the analysis of fMRI data.\n\nIt is tempting to hypothesise that the predictions on the sensory input that drive the subcortical responses in our experiment are generated in the cerebral cortex. This hypothesis would be consistent with the strong feedback connections from cerebral cortex to the subcortical sensory pathway (Winer, 1984; Winer, 2005). It would also be consistent with the results from animal studies where the deactivation of unilateral auditory cortex (B uerle et al., 2011) or the TRN (Yu et al., 2009) led to reduction of SSA in the ventral MGB (but also see contradictory findings in non-lemniscal MGB, Antunes and Malmierca, 2011, and non-lemniscal IC, Anderson and Malmierca, 2013). 
Our paradigm was optimised to study prediction error rather than the generation of such predictions, and we lacked the resolution to study cortical responses in enough detail as to disentangle activity representing predictions from activity representing prediction error. Thus, although it is unlikely that subcortical sensory nuclei like the MGB or IC are able to generate predictions based on the task instructions, whether these predictions originate in the cerebral cortex remains an open question.\n\nHigher BOLD responses to attended in contrast to unattended sounds are present in auditory cortex (Lee et al., 2014; Paltoglou et al., 2011), and to a much weaker extend also in the IC (Rinne et al., 2007; Rinne et al., 2008; Varghese et al., 2015; Riecke et al., 2018). Our results showed that responses to fully expected deviants at position 6 (posterior probability of 1) are strongly attenuated with respect to responses to deviants in positions where standards might also occur. This strong attenuation might not only be interpreted in terms of predictive coding, but also additionally by attentional gain modulation: deviants with a posterior probability of 1 might not need to be examined as carefully as deviants with low posterior probability, because its occurrence is guaranteed by task design. Two independent arguments support the interpretation that predictive coding underlies our results. First, although both conditions dev4\n and dev5\n required full attention of the participants and are thus not affected by any potential changes in the attentional state of the subject, BOLD response differences for these two conditions had strong effect sizes, ranging from d=-1.36\n to d=-0.69\n (see Table 2).\n\nSecond, our results showed that deviance responses were virtually abolished for dev6\n (Table 2). 
From previous work in animals, we know that deviance detection is salient even in anaesthetised animals (Malmierca et al., 2015) and effect sizes of SSA in the IC are comparable in the awake and anaesthetised mouse (Duque and Malmierca, 2015). Using fMRI in humans, Cacciaglia and colleagues (Cacciaglia et al., 2015) showed deviance detection in the human subcortical auditory pathway in passive listening conditions. Despite the much lower BOLD sensitivity of their experimental setup in comparison to ours, they reported a t-statistic for the deviant versus repeated standard contrast (in the e.g. left IC) of t11=5.24\n, corresponding to an effect size of d=3.15\n. In contrast, our effect sizes for the dev6\n versus std2\n contrast range from d=0.26\n (left IC) to d=-0.74\n (right MGB; Table 2). If the dev6\n response in our study was influenced by lack of attention, we would have still expected similar deviance responses as in Cacciaglia and colleagues s passive listening design. Only by interpreting the BOLD responses in our data as a correlate of predictability to abstract rules we can explain why we measured similar responses to dev6\n and std2\n in our paradigm.\n\nThe present study focused on auditory sensory pathway nuclei. Stimulus-specific adaptation at early stages of the sensory pathways has, however, also been reported in the visual (Dhruv and Carandini, 2014), olfactory (Fletcher and Wilson, 2003), and somatosensory (Maravall et al., 2013) pathways. Predictive coding serves to optimise the dynamic range of sensory systems (Brenner et al., 2000), and to maximise information transmission in the neural code by reducing the responses to expected stimuli (Fairhall et al., 2001) and to redundant portions of the incoming sensory signal (Huang and Rao, 2011). 
We speculate that abstract expectations are used as well in other sensory modalities to facilitate sensory processing in subcortical sensory nuclei.\n\nGiven the importance of predictive coding on sensory processing (e.g., Sohoglu and Davis, 2016; Davis and Johnsrude, 2007), atypical predictive coding in the subcortical sensory pathway is expected to result in profound repercussion at the cognitive level (McFadyen et al., 2020). For instance, individuals with developmental dyslexia, a disorder that is characterised by difficulties with processing speech sounds, have altered adaption dynamics to stimulus regularities (Perrachione et al., 2016; Ahissar et al., 2006; Chandrasekaran et al., 2009), altered responses in the left MGB (D az et al., 2012; Chandrasekaran et al., 2009), and atypical left hemispheric cortico-thalamic pathways (M ller-Axt et al., 2017; Tschentscher et al., 2019). Understanding the mechanisms underlying SSA and its relation to sensory processing in subcortical sensory pathways could have valuable applications in clinical contexts.\n\nGive a short overview of the above, what were the main results and their significance?"}],"stream":null,"temperature":1.0}

error in process sentinel: json-read: End of file while parsing JSON
error in process sentinel: End of file while parsing JSON

karthink commented 1 year ago

While there is a limitation (4K tokens for the default gpt-3.5-turbo model), a JSON error is a strange way for it to fail.

Could you try with a model that has a larger limit, by setting gptel-model to (for instance) gpt-3.5-turbo-16k?
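For reference, a minimal sketch of that setting in an init file. It assumes the model name is given as a string, as in this thread; newer gptel releases may expect a symbol instead, and the model itself may no longer be offered by the API.

```elisp
;; Select a model with a larger context window before sending long prompts.
;; Assumption: `gptel-model' takes a string here; newer gptel releases may
;; expect a symbol such as 'gpt-3.5-turbo-16k instead.
(setq gptel-model "gpt-3.5-turbo-16k")
```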

JonatanSahar commented 1 year ago

I get the same JSON error.


karthink commented 1 year ago

I would like to reproduce this error. Could you paste the full prompt you used here?

JonatanSahar commented 1 year ago

Sure, here is an example:

Self-suppression refers to the phenomenon that sensations initiated by our own movements are typically less salient, and elicit an attenuated neural response, compared to sensations resulting from changes in the external world. Evidence for self-suppression is provided by previous ERP studies in the auditory modality, which have found that healthy participants typically exhibit a reduced auditory N1 component when auditory stimuli are self-initiated as opposed to externally initiated. However, the literature investigating self-suppression in the visual modality is sparse, with mixed findings and experimental protocols. An EEG study was conducted to expand our understanding of self-suppression across different sensory modalities. Healthy participants experienced either an auditory (tone) or visual (pattern-reversal) stimulus following a willed button press (self-initiated), a random interval (externally initiated, unpredictable onset), or a visual countdown (externally initiated, predictable onset—to match the intrinsic predictability of self-initiated stimuli), while EEG was continuously recorded. Reduced N1 amplitudes for self- versus externally initiated tones indicated that self-suppression occurred in the auditory domain. In contrast, the visual N145 component was amplified for self- versus externally initiated pattern reversals. Externally initiated conditions did not differ as a function of their predictability. These findings highlight a difference in sensory processing of self-initiated stimuli across modalities, and may have implications for clinical disorders that are ostensibly associated with abnormal self-suppression. What is this study about? What are the main findings?


karthink commented 12 months ago

This prompt works fine for me. The discussion section of the paper in your opening comment looks much longer. Could you give me that text instead?

karthink commented 7 months ago

An issue with Curl's handling of large prompts was recently fixed in #137. Are you still facing this issue?
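This thread does not spell out what #137 changed. As a general illustration only, one common way to keep a large request body off the curl command line is to write it to a temporary file and pass it with `--data-binary @FILE`. The sketch below is not gptel's code; the function name `my/curl-args-with-data-file` and the example URL are hypothetical.

```elisp
(require 'json)

(defun my/curl-args-with-data-file (url payload)
  "Return a list of curl arguments that POST PAYLOAD as JSON to URL.
PAYLOAD is any object `json-encode' accepts.  The JSON body is written
to a temporary file and referenced as @FILE, so its size is not bound
by command-line length limits.  Hypothetical helper, not part of gptel."
  (let ((file (make-temp-file "llm-request-" nil ".json"))
        ;; Write the body as UTF-8 regardless of the user's defaults.
        (coding-system-for-write 'utf-8))
    (with-temp-file file
      (insert (json-encode payload)))
    (list "--silent" "-X" "POST"
          "-H" "Content-Type: application/json"
          "--data-binary" (concat "@" file)
          url)))
```

The returned list could be passed to `make-process` or `call-process` with `curl` as the program; the caller is responsible for deleting the temporary file once the request finishes.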

karthink commented 6 months ago

This problem should be resolved by #137. If you still experience it, please reopen this issue.